Abstract —In this paper, we introduce a new framework for 
robust multiple signal classification (MUSIC). The proposed 
framework, called robust measure-transformed (MT) MUSIC, is 
based on applying a transform to the probability distribution of 
the received signals, i.e., transformation of the probability
measure defined on the observation space. In robust MT-MUSIC, the
sample covariance is replaced by the empirical MT-covariance. 
By judicious choice of the transform we show that: (1) the 
resulting empirical MT-covariance is B-robust, with bounded 
influence function that takes negligible values for large norm 
outliers, and (2) under the assumption of spherically contoured 
noise distribution, the noise subspace can be determined from 
the eigendecomposition of the MT-covariance. Furthermore, we 
derive a new robust measure-transformed minimum description 
length (MDL) criterion for estimating the number of signals, and 
extend the MT-MUSIC framework to the case of coherent signals. 
The proposed approach is illustrated in simulation examples that 
show its advantages as compared to other robust MUSIC and 
MDL generalizations. 

Index Terms —Array processing, DOA estimation, probability 
measure transform, robust estimation, signal subspace estimation. 


I. Introduction 

The multiple signal classification (MUSIC) algorithm [1], 
[2] is a well-established technique for estimating the directions
of arrival (DOAs) of signals impinging on an array of sensors.
It operates by finding DOAs with corresponding array steering
vectors that have minimal projections onto the empirical noise
subspace, whose spanning vectors are obtained via eigendecomposition
of the sample covariance matrix (SCM) of the
array outputs. 

In the presence of outliers, possibly caused by heavy-tailed 
impulsive noise, the SCM poorly estimates the covariance of 
the array outputs, resulting in unreliable DOA estimates. In
order to overcome this limitation, several MUSIC generalizations
have been proposed in the literature that replace the SCM
with robust association or scatter matrix estimators, for which 
the empirical noise subspace can be determined from their 
eigendecomposition. 

Under the assumption that the signal and noise components
are jointly $\alpha$-stable processes [3], it was proposed in [4]
to replace the SCM with empirical covariation matrices that
involve fractional lower-order statistics. Although $\alpha$-stable
processes are appropriate for modelling impulsive noise [5],
the assumption that the signal and noise components are
jointly $\alpha$-stable is restrictive. In [6], a less restrictive approach
considering circular signals contaminated by additive $\alpha$-stable
noise was developed that replaces the SCM with matrices comprised
of empirical fractional lower-order moments. Although
this approach is less restrictive than the one proposed in [4],
violation of the signal circularity assumption, e.g., in the case
of BPSK signals, results in poor DOA estimation performance
[6].

In [7]-[9] it was proposed to apply MUSIC after passing the 
data through a zero-memory non-linear (ZMNL) function that 
suppresses outliers by clipping the amplitude of the received 
signals. The ZMNL approach is simple to implement and
has low computational complexity, and unlike the methods
proposed in [4], [6] it does not require restrictive assumptions
on the signal and noise probability distributions. Although
the ZMNL preprocessing may result in more accurate DOA
estimation than the methods in [4], [6], it may not preserve
the noise subspace, which can lead to performance degradation
[10].

Under the assumption of normally distributed signals in
heavy-tailed noise, a similar approach was proposed in [11]
that is based on successive outlier trimming until the remaining
data is Gaussian. Normality of the data is tested using the
Shapiro-Wilk test. Similarly to the ZMNL preprocessing,
the noise subspace may not be preserved after the trimming 
procedure. Moreover, the key assumption that the signals are 
Gaussian may not be satisfied in some practical scenarios. 

In [12], a different MUSIC generalization was proposed that 
replaces the SCM with empirical sign or rank covariances. 
Using only the assumption of spherically distributed noise, it 
was shown that convergent estimates of the noise subspace 
can be obtained from their eigendecomposition. The influence
functions [13] of the empirical sign and rank covariance
matrices, which measure their sensitivity to an outlier, are
bounded [14]. In other words, these estimators are B-robust
[13]. However, it can be shown that the Frobenius norms of
their matrix valued influence functions do not approach zero 
as the magnitude of the outlier approaches infinity, i.e., they 
do not reject large outliers. Indeed, the empirical sign and rank 
covariance matrices have influence functions with constant 
Frobenius norms for spherically symmetric distributions. 

In [15], robust M-estimators of scatter [16], [17], such as 
the maximum-likelihood, Huber’s [17], and Tyler’s [18] M- 
estimators, extended to complex elliptically symmetric (CES) 
distributions, were proposed as alternatives to the SCM. Under 
the class of CES distributions having finite second-order 
moments, these estimators provide consistent estimation of 
the covariance up to a positive scalar, resulting in consistent
estimation of the noise subspace. Although this approach
can provide robustness against outliers with negligible loss 
in efficiency when the observations are normally distributed, 
it may suffer from the following drawbacks. First, when 
the observations are not elliptically distributed, M-estimators 
may lose asymptotic consistency [19], which may lead to 
poor estimation of the noise subspace. Second, M-estimators 
of scatter are often computed using an iterative fixed-point 
algorithm that converges to a unique solution under some 
regularity conditions. Each iteration involves matrix inversion 
which may be computationally demanding in high dimensions, 
or unstable when the scatter matrix is close to singular. Moreover,
although the influence functions of M-estimators may be
bounded, they may not behave well for large norm outliers that 
can negatively affect estimation performance. Indeed, similarly 
to the method of [12], Tyler’s scatter M-estimator does not 
reject large outliers and its matrix valued influence function 
[15] has constant Frobenius norm for spherically symmetric 
distributions. 

In [20], a robust MUSIC generalization called $\ell_p$-MUSIC
was proposed that estimates the noise subspace by minimizing
the $\ell_p$-norm ($1 < p < 2$) of the residual between the data
matrix and its low-rank representation. The $\ell_p$-norm with
$p < 2$ is less sensitive to outliers than the $\ell_2$-norm. Therefore,
$\ell_p$-MUSIC is more robust against impulsive noise as compared
to MUSIC, which is based on $\ell_2$-norm minimization of the
data fitting error matrix [20]. However, unlike MUSIC and
other robust generalizations, in $\ell_p$-MUSIC the empirical noise
subspace is not determined by solving a simple eigendecomposition
problem. Indeed, in [20] the non-convex $\ell_p$-norm
minimization is performed by an alternating convex optimization
scheme that may converge to undesired local minima.

In this paper, we introduce a new framework for robust 
MUSIC. The proposed framework, called robust measure- 
transformed MUSIC (MT-MUSIC), is inspired by a measure 
transformation approach that was recently applied to canonical 
correlation analysis [21] and independent component analysis 
[22]. Robust MT-MUSIC is based on applying a transform
to the probability distribution of the data. The proposed
transform is defined by a non-negative function, called the
MT-function, and maps the probability distribution into a set of
new probability measures on the observation space. By modifying
the MT-function, classes of measure transformations can
be obtained that have useful properties. Under the proposed 
transform we define the measure-transformed (MT) covariance 
and derive its strongly consistent estimate, which is also shown 
to be Fisher consistent [23]. Robustness of the empirical 
MT-covariance is established in terms of boundedness of its 
influence function. A sufficient condition on the MT-function 
that guarantees B-robustness of the empirical MT-covariance 
is also obtained. 

In robust MT-MUSIC, the SCM is replaced by the empirical 
MT-covariance. The MT-function is selected such that the 
resulting empirical MT-covariance is B-robust, and the noise 
subspace can be determined from the eigendecomposition of 
the MT-covariance. By modifying the MT-function such that 
these conditions are satisfied a class of robust MT-MUSIC 
algorithms can be obtained. 


Selection of the MT-function under the family of zero-centered
Gaussian functions, parameterized by a scale parameter,
results in a new algorithm called Gaussian MT-MUSIC.
We show that the empirical Gaussian MT-covariance is B-robust
with influence function that approaches zero as the
outlier magnitude approaches infinity. Under the additional
assumption that the noise component has a spherically contoured
distribution, we show that the noise subspace can be
determined from the eigendecomposition of the Gaussian
MT-covariance. Note that this spherically contoured noise distribution
assumption is weaker than the standard i.i.d. Gaussian
noise assumption. We propose a data-driven procedure for
selecting the scale parameter of the Gaussian MT-function.
This procedure has the property that it prevents significant
transform-domain Fisher-information loss when the observations
are normally distributed.

In this paper, a robust estimate of the number of signals
is proposed that is based on minimization of a measure-transformed
version of the minimum description length (MDL)
criterion [24]. This criterion, called MT-MDL, is obtained by
replacing the eigenvalues of the SCM with the eigenvalues of
the empirical MT-covariance. We show that under some mild
conditions, minimization of the MT-MDL criterion results in
strongly consistent estimation of the number of signals regardless
of the underlying distribution of the data. These conditions
are satisfied when the Gaussian MT-function is implemented
and the noise component has a spherically contoured distribution.
The MT-MDL criterion with the Gaussian MT-function
is called the Gaussian MT-MDL.

The proposed Gaussian MT-MUSIC algorithm is extended
to the case of coherent signals impinging on a uniform linearly
spaced array (ULA) [25]. This extension is carried out through
forward-backward spatial smoothing of the empirical Gaussian
MT-covariance matrix.

The Gaussian MT-MUSIC algorithm and the Gaussian MT-MDL
criterion are evaluated by simulations to illustrate their
advantages relative to other robust MUSIC and MDL generalizations.
We examine scenarios of non-coherent and coherent
signals contaminated by several types of spherically contoured
noise distributions arising from the compound Gaussian (CG)
family. This family encompasses common heavy-tailed distributions,
such as the $t$-distribution, the $K$-distribution, and the
CG-distribution with inverse Gaussian texture, and has been
widely adopted for modeling radar clutter [26]-[29].

The paper is organized as follows. In Section II, the
robust MT-MUSIC framework is presented. In Section III,
the Gaussian MT-MUSIC algorithm is derived. In Section IV,
we propose a measure-transformed generalization of the
MDL criterion for estimating the number of signals. In Section V,
a spatially smoothed version of the Gaussian MT-MUSIC
algorithm for coherent signals is developed. The proposed
approach is illustrated by simulation in Section VI. In Section
VII, the main points of this contribution are summarized. The
proofs of the propositions and theorems stated throughout the
paper are given in the Appendix.

II. Robust measure-transformed MUSIC


In this section, the robust MT-MUSIC procedure is presented.
First, the sensor array model is introduced. Second, a
general transformation on probability measures is established. 
Under the proposed transform, we define the MT-covariance 
matrix and derive its strongly consistent estimate. Robustness 
of the empirical MT-covariance is studied by analyzing its 
influence function. Finally, based on the assumed array model, 
we propose a robust MT-MUSIC procedure that replaces the 
SCM with the empirical MT-covariance of the received signals. 

A. Array model 

Consider an array of $p$ sensors that receive signals generated
by $q < p$ narrowband incoherent far-field point sources with
distinct azimuthal DOAs $\theta_1, \ldots, \theta_q$. Under this model, the
array output satisfies [2]:

$$\mathbf{X}(n) = \mathbf{A}\mathbf{S}(n) + \mathbf{W}(n), \tag{1}$$

where $n \in \mathbb{N}$ is a discrete time index, $\mathbf{X}(n) \in \mathbb{C}^p$ is
the vector of received signals, $\mathbf{S}(n) \in \mathbb{C}^q$ is a zero-mean
latent random vector, comprised of the emitted signals, with
non-singular covariance, and $\mathbf{W}(n) \in \mathbb{C}^p$ is an additive
spatially white noise with zero location parameter. The matrix
$\mathbf{A} \triangleq [\mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_q)] \in \mathbb{C}^{p \times q}$ is the array steering
matrix, where $\mathbf{a}(\theta)$ is the steering vector of the array toward
direction $\theta$. We assume that the array is unambiguous, i.e.,
any collection of $p$ steering vectors corresponding to distinct
DOAs forms a linearly independent set. Therefore, $\mathbf{A}$ has
full column rank, and identification of its column vectors is
equivalent to the problem of identifying the DOAs. We also
assume that $\mathbf{S}(n)$ and $\mathbf{W}(n)$ are statistically independent and
first-order stationary. To simplify notation, the time index $n$
will be omitted in the sequel except where noted.


B. Probability measure transform 

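We define the measure space $(\mathcal{X}, \mathcal{S}_{\mathcal{X}}, P_X)$, where $\mathcal{X}$ is the
observation space of a random vector $\mathbf{X} \in \mathbb{C}^p$, $\mathcal{S}_{\mathcal{X}}$ is a
$\sigma$-algebra over $\mathcal{X}$, and $P_X$ is a probability measure on $\mathcal{S}_{\mathcal{X}}$. Let
$g : \mathcal{X} \to \mathbb{C}$ denote an integrable scalar function on $\mathcal{X}$. The
expectation of $g(\mathbf{X})$ under $P_X$ is defined as

$$\mathrm{E}[g(\mathbf{X}); P_X] \triangleq \int_{\mathcal{X}} g(\mathbf{x})\, dP_X(\mathbf{x}), \tag{2}$$

where $\mathbf{x} \in \mathcal{X}$. The empirical probability measure $\hat{P}_X$ given
a sequence of samples $\mathbf{X}(n)$, $n = 1, \ldots, N$ from $P_X$ is
specified by

$$\hat{P}_X(A) = \frac{1}{N} \sum_{n=1}^{N} \delta_{\mathbf{X}(n)}(A), \tag{3}$$

where $A \in \mathcal{S}_{\mathcal{X}}$, and $\delta_{\mathbf{X}(n)}(\cdot)$ is the Dirac probability measure
at $\mathbf{X}(n)$ [30].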

Definition 1. Given a non-negative function $u : \mathbb{C}^p \to \mathbb{R}_+$
satisfying

$$0 < \mathrm{E}[u(\mathbf{X}); P_X] < \infty, \tag{4}$$

a transform on $P_X$ is defined via the relation:

$$Q_X^{(u)}(A) \triangleq T_u[P_X](A) = \int_A \varphi_u(\mathbf{x})\, dP_X(\mathbf{x}), \tag{5}$$

where $A \in \mathcal{S}_{\mathcal{X}}$, $\mathbf{x} \in \mathcal{X}$, and

$$\varphi_u(\mathbf{x}) \triangleq \frac{u(\mathbf{x})}{\mathrm{E}[u(\mathbf{X}); P_X]}. \tag{6}$$

The function $u(\cdot)$, associated with the transform $T_u[\cdot]$, is
called the MT-function.

Proposition 1 (Properties of the transform). Let $Q_X^{(u)}$ be
defined by relation (5). Then

1) $Q_X^{(u)}$ is a probability measure on $\mathcal{S}_{\mathcal{X}}$.

2) $Q_X^{(u)}$ is absolutely continuous w.r.t. $P_X$, with
Radon-Nikodym derivative [30]:

$$\frac{dQ_X^{(u)}(\mathbf{x})}{dP_X(\mathbf{x})} = \varphi_u(\mathbf{x}). \tag{7}$$

3) Assume that the MT-function $u(\cdot)$ is strictly positive, and
let $g : \mathcal{X} \to \mathbb{C}^m$ denote an integrable function over
$\mathcal{X}$. If the covariance of $g(\mathbf{X})$ under $P_X$ is non-singular,
then it is non-singular under the transformed probability
measure $Q_X^{(u)}$.

[A proof is given in Appendix A]

The probability measure $Q_X^{(u)}$ is said to be generated by the
MT-function $u(\cdot)$. By modifying $u(\cdot)$, such that the condition
(4) is satisfied, virtually any probability measure on $\mathcal{S}_{\mathcal{X}}$ can
be obtained.


C. The measure-transformed covariance

According to (7) the covariance of $\mathbf{X}$ under $Q_X^{(u)}$ is given
by

$$\boldsymbol{\Sigma}_X^{(u)} \triangleq \mathrm{E}\left[ (\mathbf{X} - \boldsymbol{\mu}_X^{(u)}) (\mathbf{X} - \boldsymbol{\mu}_X^{(u)})^H \varphi_u(\mathbf{X}); P_X \right], \tag{8}$$

where

$$\boldsymbol{\mu}_X^{(u)} \triangleq \mathrm{E}[\mathbf{X} \varphi_u(\mathbf{X}); P_X] \tag{9}$$

is the expectation of $\mathbf{X}$ under $Q_X^{(u)}$. Equation (8) implies that
$\boldsymbol{\Sigma}_X^{(u)}$ is a weighted covariance matrix of $\mathbf{X}$ under $P_X$, with
weighting function $\varphi_u(\cdot)$. Hence, $\boldsymbol{\Sigma}_X^{(u)}$ can be estimated using
only samples from the distribution $P_X$. By modifying the
MT-function $u(\cdot)$, such that the condition (4) in Definition 1 is
satisfied, the MT-covariance matrix under $Q_X^{(u)}$ is modified.
In particular, by choosing $u(\mathbf{x}) \equiv 1$, we have $Q_X^{(u)} = P_X$, for
which the standard covariance matrix $\boldsymbol{\Sigma}_X$ is obtained.

In the following proposition, a strongly consistent estimate
of $\boldsymbol{\Sigma}_X^{(u)}$ is constructed, based on $N$ i.i.d. samples of $\mathbf{X}$.
Unlike the empirical MT-covariance proposed in [21], the
construction is based on complex observations and its almost
sure convergence conditions are different.

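Proposition 2 (Strongly consistent estimate of the MT-covariance).
Let $\mathbf{X}(n)$, $n = 1, \ldots, N$ denote a sequence of i.i.d.
samples from $P_X$, and define the empirical covariance estimate

$$\hat{\boldsymbol{\Sigma}}_X^{(u)} \triangleq \sum_{n=1}^{N} (\mathbf{X}(n) - \hat{\boldsymbol{\mu}}_X^{(u)}) (\mathbf{X}(n) - \hat{\boldsymbol{\mu}}_X^{(u)})^H \hat{\varphi}_u(\mathbf{X}(n)), \tag{10}$$

where

$$\hat{\boldsymbol{\mu}}_X^{(u)} \triangleq \sum_{n=1}^{N} \mathbf{X}(n)\, \hat{\varphi}_u(\mathbf{X}(n)) \tag{11}$$

and

$$\hat{\varphi}_u(\mathbf{X}(n)) \triangleq \frac{u(\mathbf{X}(n))}{\sum_{m=1}^{N} u(\mathbf{X}(m))}. \tag{12}$$

If

$$\mathrm{E}\left[ \|\mathbf{X}\|_2^2\, u(\mathbf{X}); P_X \right] < \infty, \tag{13}$$

where $\|\cdot\|_2$ denotes the $\ell_2$-norm, then $\hat{\boldsymbol{\Sigma}}_X^{(u)} \to \boldsymbol{\Sigma}_X^{(u)}$ almost
surely (a.s.) as $N \to \infty$. [A proof is given in Appendix B]

Note that for $u(\mathbf{X}) \equiv 1$ the estimator $\frac{N}{N-1} \hat{\boldsymbol{\Sigma}}_X^{(u)}$ reduces
to the standard unbiased SCM. Also notice that $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ can
be written as a statistical functional $\boldsymbol{\Psi}_X^{(u)}[\cdot]$ of the empirical
probability measure $\hat{P}_X$ defined in (3), i.e.,

$$\hat{\boldsymbol{\Sigma}}_X^{(u)} = \boldsymbol{\Psi}_X^{(u)}[\hat{P}_X] \triangleq \frac{\mathrm{E}\left[ (\mathbf{X} - \boldsymbol{\eta}_X^{(u)}[\hat{P}_X]) (\mathbf{X} - \boldsymbol{\eta}_X^{(u)}[\hat{P}_X])^H u(\mathbf{X}); \hat{P}_X \right]}{\mathrm{E}[u(\mathbf{X}); \hat{P}_X]}, \tag{14}$$

where

$$\boldsymbol{\eta}_X^{(u)}[\hat{P}_X] \triangleq \frac{\mathrm{E}[\mathbf{X} u(\mathbf{X}); \hat{P}_X]}{\mathrm{E}[u(\mathbf{X}); \hat{P}_X]}. \tag{15}$$

By (6), (8) and (14), when $\hat{P}_X$ is replaced by $P_X$ we have
$\boldsymbol{\Psi}_X^{(u)}[P_X] = \boldsymbol{\Sigma}_X^{(u)}$, which implies that $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ is Fisher consistent
[23].


D. Robustness of the empirical MT-covariance

Here, we study the robustness of the empirical MT-covariance
$\hat{\boldsymbol{\Sigma}}_X^{(u)}$ using its matrix valued influence function
[13]. Define the probability measure $P_\epsilon \triangleq (1 - \epsilon) P_X + \epsilon \delta_{\mathbf{y}}$,
where $0 < \epsilon < 1$, $\mathbf{y} \in \mathbb{C}^p$, and $\delta_{\mathbf{y}}$ is the Dirac probability
measure at $\mathbf{y}$. The influence function of a Fisher consistent
estimator with statistical functional $H[\cdot]$ at probability distribution
$P_X$ is defined as [13]:

$$\mathrm{IF}_H(\mathbf{y}; P_X) \triangleq \lim_{\epsilon \to 0} \frac{H[P_\epsilon] - H[P_X]}{\epsilon} = \left. \frac{\partial H[P_\epsilon]}{\partial \epsilon} \right|_{\epsilon = 0}. \tag{16}$$

The influence function describes the effect on the estimator of
an infinitesimal contamination at the point $\mathbf{y}$. An estimator is
said to be B-robust if its influence function is bounded [13].
Using (14) and (16) one can verify that the influence function
of $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ is given by

$$\mathrm{IF}_{\boldsymbol{\Psi}_X^{(u)}}(\mathbf{y}; P_X) = \frac{u(\mathbf{y}) \left[ (\mathbf{y} - \boldsymbol{\mu}_X^{(u)}) (\mathbf{y} - \boldsymbol{\mu}_X^{(u)})^H - \boldsymbol{\Sigma}_X^{(u)} \right]}{\mathrm{E}[u(\mathbf{X}); P_X]}. \tag{17}$$

The following proposition states a sufficient condition for
boundedness of (17). This condition is satisfied for the Gaussian
MT-function proposed in Section III.

Proposition 3. The influence function (17) is bounded if the
MT-function $u(\mathbf{y})$ and the product $u(\mathbf{y}) \|\mathbf{y}\|_2^2$ are bounded over
$\mathbb{C}^p$. [A proof is given in Appendix D]


E. The robust MT-MUSIC procedure

In robust MT-MUSIC the measure transformation (5) is
applied to the probability distribution $P_X$ of the array output
$\mathbf{X}(n)$ (1). The MT-function $u(\cdot)$ is selected such that the
following conditions are satisfied:

1) The resulting empirical MT-covariance $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ is B-robust.

2) Let $\lambda_1^{(u)} \geq \cdots \geq \lambda_p^{(u)}$ denote the eigenvalues of $\boldsymbol{\Sigma}_X^{(u)}$.
The $p - q$ smallest eigenvalues of $\boldsymbol{\Sigma}_X^{(u)}$ satisfy:

$$\lambda_{q+1}^{(u)} = \cdots = \lambda_p^{(u)}, \tag{18}$$

and their corresponding eigenvectors span the null-space
of $\mathbf{A}^H$, also called the noise subspace.

Let $\hat{\mathbf{V}}^{(u)} \in \mathbb{C}^{p \times (p-q)}$ denote the matrix comprised of $p - q$
eigenvectors of $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ corresponding to its smallest eigenvalues.
The DOAs are estimated by finding the $q$ highest maxima of
the measure-transformed pseudo-spectrum:

$$\hat{P}^{(u)}(\theta) \triangleq \left\| \hat{\mathbf{V}}^{(u)H} \mathbf{a}(\theta) \right\|_2^{-2}. \tag{19}$$

By modifying the MT-function $u(\cdot)$ such that the stated
conditions 1 and 2 are satisfied, a family of robust MT-MUSIC
algorithms can be obtained. In particular, by choosing
$u(\mathbf{x}) \propto \|\mathbf{x}\|_2^{-2}$ one can verify using (8) that for zero-centered
symmetric distributions the resulting MT-covariance
is proportional to the sign-covariance, proposed for the robust
MUSIC generalization in [12]. Another particular choice of
MT-function leading to the Gaussian MT-MUSIC algorithm is
discussed in the following section.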

III. The Gaussian MT-MUSIC 

In this section, we parameterize the MT-function $u(\cdot; \tau)$,
with scale parameter $\tau \in \mathbb{R}_{++}$, under the Gaussian family
of functions centered at the origin. This results in a B-robust
empirical MT-covariance matrix that rejects large outliers. Under
the assumption of spherically contoured noise distribution,
we show that the noise subspace can be determined from the
eigendecomposition of the MT-covariance. Choice of the scale
parameter $\tau$ is also discussed.

A. The Gaussian MT-function 

We define the Gaussian MT-function $u_G(\cdot; \cdot)$ as

$$u_G(\mathbf{x}; \tau) \triangleq (\pi \tau^2)^{-p} \exp\left( -\|\mathbf{x}\|_2^2 / \tau^2 \right), \quad \tau \in \mathbb{R}_{++}. \tag{20}$$

Using (6)-(8) and (20) one can verify that the resulting Gaussian
MT-covariance always takes finite values. Additionally,
notice that the Gaussian MT-function satisfies the condition
(13) in Proposition 2, and therefore, the empirical Gaussian
MT-covariance, based on i.i.d. samples from any probability
distribution $P_X$, is strongly consistent. For any fixed scale
parameter $\tau$, the Gaussian MT-function also satisfies the
condition in Proposition 3, resulting in a B-robust empirical
Gaussian MT-covariance $\hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau)$. The following proposition,
which follows directly from (17) and (20), states that the
Frobenius norm of the corresponding influence function approaches
zero as the contamination norm approaches infinity.


Proposition 4. For any fixed scale parameter $\tau$ of the Gaussian
MT-function (20), the influence function of the resulting
empirical Gaussian MT-covariance satisfies

$$\left\| \mathrm{IF}_{\boldsymbol{\Psi}_X^{(u_G)}}(\mathbf{y}; P_X) \right\|_{\mathrm{Fro}} \to 0 \quad \text{as} \quad \|\mathbf{y}\|_2 \to \infty, \tag{21}$$

where $\|\cdot\|_{\mathrm{Fro}}$ denotes the Frobenius norm. [A proof is given
in Appendix E]


Thus, unlike the SCM and other robust covariance approaches,
the empirical Gaussian MT-covariance rejects large
outliers. This property is illustrated in Fig. 1 for a standard 
bivariate complex normal distribution, as compared to the 
empirical sign-covariance, Tyler’s scatter M-estimator, and the 
SCM. 



Fig. 1. Frobenius norms of the influence functions associated with the
empirical Gaussian MT-covariance for $\tau = 1$, $\tau = 1.5$ and $\tau = 2$, Tyler’s
scatter M-estimator, the empirical sign-covariance, and the SCM, versus the
contamination norm $\|\mathbf{y}\|$, for a bivariate standard complex normal distribution.
Notice that the influence function approaches zero for large $\|\mathbf{y}\|$ only for the
proposed Gaussian MT-covariance estimator, indicating enhanced robustness
to outliers.
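
The qualitative behavior stated in Proposition 4 can be checked numerically. The following Python/NumPy sketch is only an illustration under an added assumption: $P_X$ is the standard complex normal distribution $\mathcal{CN}(\mathbf{0}, \mathbf{I}_p)$, for which direct Gaussian integration of (6)-(8) with (20) gives $\boldsymbol{\mu}_X^{(u_G)} = \mathbf{0}$, $\boldsymbol{\Sigma}_X^{(u_G)}(\tau) = \frac{\tau^2}{1+\tau^2}\mathbf{I}$ and $u_G(\mathbf{y};\tau)/\mathrm{E}[u_G(\mathbf{X};\tau); P_X] = \left(\frac{1+\tau^2}{\tau^2}\right)^p \exp(-\|\mathbf{y}\|_2^2/\tau^2)$. These closed forms and the function names are not taken from the original text.

```python
import numpy as np

def gaussian_mt_if_norm(y_norm, tau, p=2):
    """Frobenius norm of (17) for the Gaussian MT-function, assuming P_X = CN(0, I_p)."""
    y_norm = np.asarray(y_norm, dtype=float)
    s = tau**2 / (1.0 + tau**2)  # Gaussian MT-covariance equals s * I in this case
    weight = ((1.0 + tau**2) / tau**2) ** p * np.exp(-(y_norm**2) / tau**2)
    # ||y y^H - s I||_Fro depends on y only through its norm
    frob = np.sqrt(y_norm**4 - 2.0 * s * y_norm**2 + p * s**2)
    return weight * frob

def scm_if_norm(y_norm, p=2):
    """Same quantity for u = 1 in (17), i.e. the standard covariance: ||y y^H - I||_Fro."""
    y_norm = np.asarray(y_norm, dtype=float)
    return np.sqrt(y_norm**4 - 2.0 * y_norm**2 + p)

norms = np.array([0.0, 1.0, 5.0, 20.0])
for tau in (1.0, 1.5, 2.0):
    print(f"tau={tau}:", gaussian_mt_if_norm(norms, tau))
print("SCM:", scm_if_norm(norms))
```

Under these assumptions the Gaussian MT values decay to zero with the contamination norm, and faster for smaller $\tau$, while the values for $u \equiv 1$ grow without bound.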


Notice that as the scale parameter $\tau$ of the Gaussian
MT-function (20) approaches infinity, the corresponding empirical
Gaussian MT-covariance $\hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau)$ approaches the non-robust
standard SCM $\hat{\boldsymbol{\Sigma}}_X$, whose influence function is unbounded.
On the other hand, as $\tau$ decreases it can be shown using the
upper bound in (50) that the influence function of $\hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau)$
has a faster asymptotic decay, as illustrated in Fig. 1, i.e., the
empirical Gaussian MT-covariance becomes more resilient to
large outliers. However, we note that this may come at the
expense of information loss. The trade-off between robustness
and information loss is discussed in Subsection III-D.
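
As a minimal sketch of how the empirical Gaussian MT-covariance could be computed from data, the Python/NumPy code below evaluates (10)-(12) with the MT-function (20). The function name `gaussian_mt_covariance`, the row-wise snapshot layout, and the subtraction of the smallest squared norm inside the exponent (a numerical safeguard that cancels in (12), as does the constant $(\pi\tau^2)^{-p}$) are assumptions made for illustration.

```python
import numpy as np

def gaussian_mt_covariance(X, tau):
    """Empirical Gaussian MT-covariance, eqs. (10)-(12) with u = u_G of (20).

    X   : (N, p) complex array, one snapshot X(n) per row (assumed layout).
    tau : positive scale parameter of the Gaussian MT-function.
    """
    X = np.asarray(X, dtype=complex)
    sq_norms = np.sum(np.abs(X) ** 2, axis=1)
    # Unnormalized weights; any common factor cancels in the normalization (12).
    w = np.exp(-(sq_norms - sq_norms.min()) / tau**2)
    phi = w / w.sum()                    # empirical weights, eq. (12)
    mu = phi @ X                         # MT-mean, eq. (11)
    Xc = X - mu
    return (Xc.T * phi) @ Xc.conj()      # sum_n phi_n (x_n - mu)(x_n - mu)^H, eq. (10)

rng = np.random.default_rng(0)
N, p, tau = 20000, 3, 2.0
X = (rng.standard_normal((N, p)) + 1j * rng.standard_normal((N, p))) / np.sqrt(2)
clean = gaussian_mt_covariance(X, tau)

X_out = X.copy()
X_out[0] = 1e3                           # a single large-norm outlier
contaminated = gaussian_mt_covariance(X_out, tau)
print(np.max(np.abs(clean - contaminated)))
```

For standard complex normal samples the estimate is expected to be close to $\frac{\tau^2}{1+\tau^2}\mathbf{I}$, and the large-norm outlier receives a weight that is numerically zero, so it has almost no effect on the result.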

B. The Gaussian MT-covariance for spherically distributed 
noise 

We assume that the noise component in (1) has a complex
spherically contoured distribution, also known as a spherical
distribution [15], having stochastic representation:

$$\mathbf{W}(n) = \nu(n)\, \boldsymbol{\zeta}(n), \tag{22}$$

where $\nu(n) \triangleq \rho(n) / \|\boldsymbol{\zeta}(n)\|_2$, $\rho(n) \in \mathbb{R}_{++}$ is a first-order
stationary process, and $\boldsymbol{\zeta}(n) \in \mathbb{C}^p$ is a proper-complex
wide-sense stationary Gaussian process with zero-mean and
unit covariance, which is statistically independent of $\rho(n)$.
The stochastic representation (22) is a direct consequence of
the following properties [15]: 1) Any spherically distributed
complex random vector $\mathbf{W}$ can be represented as $\mathbf{W} = \rho \mathbf{U}$,
where $\rho$ is a strictly positive random variable, and $\mathbf{U}$ is uniformly
distributed on the complex unit sphere and statistically
independent of $\rho$. 2) Any random vector $\mathbf{U}$ that is uniformly
distributed on the complex unit sphere can be represented as
$\mathbf{U} = \boldsymbol{\zeta} / \|\boldsymbol{\zeta}\|_2$, where $\boldsymbol{\zeta}$ is a complex random vector with
zero-mean spherically contoured distribution, for example a
complex Gaussian random vector with zero-mean and unit
covariance.
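
The stochastic representation (22) suggests a direct way to draw spherically contoured noise samples. The sketch below is illustrative only: the function name `spherical_noise` is hypothetical, and the Pareto-distributed radii are an arbitrary heavy-tailed choice, not one of the noise models used in the paper.

```python
import numpy as np

def spherical_noise(rho, p, rng):
    """Draw W(n) = nu(n) * zeta(n) with nu(n) = rho(n) / ||zeta(n)||_2, as in (22).

    rho : (N,) array of strictly positive radii, drawn independently of zeta.
    Returns an (N, p) complex array whose rows have Euclidean norm rho(n).
    """
    rho = np.asarray(rho, dtype=float)
    n_samples = rho.shape[0]
    # zeta(n): proper complex Gaussian with zero mean and unit covariance
    zeta = (rng.standard_normal((n_samples, p))
            + 1j * rng.standard_normal((n_samples, p))) / np.sqrt(2)
    nu = rho / np.linalg.norm(zeta, axis=1)
    return nu[:, None] * zeta

rng = np.random.default_rng(0)
rho = 1.0 + rng.pareto(1.5, size=1000)   # hypothetical heavy-tailed radii, all >= 1
W = spherical_noise(rho, p=4, rng=rng)
print(np.allclose(np.linalg.norm(W, axis=1), rho))
```

By construction the direction $\boldsymbol{\zeta}/\|\boldsymbol{\zeta}\|_2$ is uniform on the complex unit sphere, so the heaviness of the tails is controlled entirely by the distribution chosen for $\rho$.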

The structure of the resulting Gaussian MT-covariance of 
the array output is given in the following theorem. 

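Theorem 1. Under the array model (1) and the spherical
noise assumption (22), the Gaussian MT-covariance of $\mathbf{X}(n)$
takes the form:

$$\boldsymbol{\Sigma}_X^{(u_G)}(\tau) = \mathbf{A} \boldsymbol{\Sigma}_{\alpha S}^{(g)}(\tau) \mathbf{A}^H + \sigma_{\alpha W}^{2\,(h)}(\tau)\, \mathbf{I}, \tag{23}$$

where $\boldsymbol{\Sigma}_{\alpha S}^{(g)}(\tau)$ is a non-singular covariance matrix of the
scaled signal component $\alpha^2(n) \mathbf{S}(n)$, $\alpha(n) \triangleq \sqrt{\tau^2 / (\nu^2(n) + \tau^2)}$,
under the transformed joint probability measure $Q_{\alpha, S}^{(g)}$ with
the MT-function $g(\alpha, \mathbf{S}; \tau) \triangleq \left( \frac{\alpha^2}{\pi \tau^2} \right)^p \exp\left( -\alpha^2 \|\mathbf{A}\mathbf{S}\|_2^2 / \tau^2 \right)$.
The term $\sigma_{\alpha W}^{2\,(h)}(\tau)$, multiplying the identity matrix $\mathbf{I}$, is the
variance of the scaled noise component $\alpha(n) \mathbf{W}(n)$ under
the transformed joint probability measure with the MT-function
$h(\alpha; \tau) \triangleq \mathrm{E}[g(\alpha, \mathbf{S}; \tau); P_S]$. [A proof is given in
Appendix F]

Thus, by the structure (23) and the facts that the steering matrix
$\mathbf{A}$ has full column rank and the MT-covariance $\boldsymbol{\Sigma}_{\alpha S}^{(g)}(\tau)$
is non-singular, we conclude that Condition 2 in Subsection
II-E is satisfied.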

C. The Gaussian MT-MUSIC algorithm 

The empirical Gaussian MT-covariance is B-robust, and, 
under the spherical noise assumption (22), the noise subspace 
can be determined from the eigendecomposition of the Gaussian
MT-covariance. The Gaussian MT-MUSIC algorithm is
implemented by replacing the MT-function in (19) with the 
Gaussian MT-function (20). 
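
The steps of the Gaussian MT-MUSIC algorithm can be sketched end to end as follows. This is an illustrative Python/NumPy example, not the authors' implementation. It assumes a half-wavelength ULA (the steering vector form matches $\mathbf{b}(\theta)$ of Section V), a hand-picked scale $\tau = 5$, compound-Gaussian noise with a hypothetical inverse-gamma texture, and a simple grid search with a basic local-maximum picker. All function names are hypothetical.

```python
import numpy as np

def gaussian_mt_covariance(X, tau):
    """Empirical Gaussian MT-covariance, eqs. (10)-(12) with (20); rows of X are snapshots."""
    sq = np.sum(np.abs(X) ** 2, axis=1)
    w = np.exp(-(sq - sq.min()) / tau**2)     # common factors cancel in (12)
    phi = w / w.sum()
    Xc = X - phi @ X
    return (Xc.T * phi) @ Xc.conj()

def ula_steering(theta_deg, p, spacing=0.5):
    """Steering vectors of a ULA; spacing is d/lambda (0.5 is an assumption)."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    k = np.arange(p)[:, None]
    return np.exp(-2j * np.pi * spacing * k * np.sin(theta)[None, :])

def mt_music(X, q, tau, grid_deg):
    """Return the q DOA estimates (degrees) and the pseudo-spectrum (19) on the grid."""
    p = X.shape[1]
    _, vecs = np.linalg.eigh(gaussian_mt_covariance(X, tau))  # ascending eigenvalues
    V = vecs[:, : p - q]                                      # empirical noise subspace
    proj = V.conj().T @ ula_steering(grid_deg, p)
    spectrum = 1.0 / np.sum(np.abs(proj) ** 2, axis=0)
    inner = np.arange(1, len(grid_deg) - 1)
    is_peak = (spectrum[inner] > spectrum[inner - 1]) & (spectrum[inner] > spectrum[inner + 1])
    peaks = inner[is_peak]
    best = peaks[np.argsort(spectrum[peaks])[-q:]]            # q highest local maxima
    return np.sort(grid_deg[best]), spectrum

rng = np.random.default_rng(1)
p, q, N = 8, 2, 1000
doas = np.array([-10.0, 20.0])
A = ula_steering(doas, p)
S = (rng.standard_normal((N, q)) + 1j * rng.standard_normal((N, q))) / np.sqrt(2)
zeta = (rng.standard_normal((N, p)) + 1j * rng.standard_normal((N, p))) / np.sqrt(2)
texture = 1.0 / rng.gamma(shape=1.0, scale=1.0, size=N)      # heavy-tailed texture
W = 0.3 * np.sqrt(texture)[:, None] * zeta
X = S @ A.T + W                                               # rows follow model (1)

grid = np.linspace(-90.0, 90.0, 1801)
estimates, _ = mt_music(X, q, tau=5.0, grid_deg=grid)
print(estimates)
```

In practice the scale parameter would be chosen by the data-driven rule of Subsection III-D rather than fixed by hand, and $q$ would be estimated, e.g., with the criterion of Section IV.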


D. Choosing the scale parameter of the Gaussian MT-function 

While de-emphasizing non-informative outliers, e.g., caused
by heavy-tailed distributions, the empirical Gaussian MT-covariance
is less informative than the standard sample-covariance
when the observations are normally distributed.
This is seen in the following theorem that follows from the
Gaussian Fisher information formula [33] and elementary trace
inequalities [34].

Theorem 2. Assume that the probability distribution $P_X$ of the
array outputs (1) is proper complex normal. The ratio between
the Fisher information for estimating $\theta_k \in \{\theta_1, \ldots, \theta_q\}$ under
the transformed probability measure $Q_X^{(u_G)}$ (with the MT-function
(20)) and the corresponding Fisher information under
$P_X$ satisfies:

$$\frac{\tau^4}{\left( \lambda_{\max}(\boldsymbol{\Sigma}_X) + \tau^2 \right)^2} \leq \frac{F(\theta_k; Q_X^{(u_G)})}{F(\theta_k; P_X)} \leq \frac{\tau^4}{\left( \lambda_{\min}(\boldsymbol{\Sigma}_X) + \tau^2 \right)^2}, \tag{24}$$

where $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ are the minimum and maximum
eigenvalues, respectively. [A proof is given in Appendix G]

Therefore, in order to prevent a significant transform-domain
Fisher information loss when the observations are
normally distributed, we propose to choose the following safeguard
scale parameter:

$$\tau = \sqrt{c\, \lambda_{\max}(\boldsymbol{\Sigma}_X)}, \tag{25}$$

where $c$ is some positive constant that guarantees that the
Fisher information ratio (24) is greater than $(c/(1 + c))^2$.
Since in practice $\boldsymbol{\Sigma}_X$ is unknown, it is replaced by the
following empirical robust estimate that is based on its relation
(55) to the Gaussian MT-covariance for normally distributed
observations:

$$\hat{\boldsymbol{\Sigma}}_X = \tau^2 \hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau) \left( \tau^2 \mathbf{I} - \hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau) \right)^{-1}, \tag{26}$$

where the empirical Gaussian MT-covariance $\hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau)$
is obtained using (10), and $\tau^2$ must be greater than
$\lambda_{\max}\left( \hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau) \right)$ in order to guarantee positive definiteness
of $\hat{\boldsymbol{\Sigma}}_X$. Therefore, substitution of (26) into (25) results in the
following data-driven selection rule:

$$\tau = \sqrt{(c + 1)\, \lambda_{\max}\left( \hat{\boldsymbol{\Sigma}}_X^{(u_G)}(\tau) \right)}, \tag{27}$$

which can be solved numerically, e.g., using fixed-point iteration
[41].

In the general case, when the observations are not necessarily
Gaussian, the selection rule (25) controls the amount of
second-order statistical information loss caused by the measure
transformation. Increasing the constant $c$ increases the scale
parameter $\tau$ and reduces the information loss, while on the
other hand, makes the estimator more sensitive to large-norm
outliers, as illustrated in Fig. 1.
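
One possible way to solve the selection rule (27) by fixed-point iteration is sketched below in Python/NumPy. The starting point (square root of the median squared snapshot norm), the stopping rule, the default $c = 5$ and the function names are assumptions made for illustration. The text only states that (27) can be solved numerically.

```python
import numpy as np

def gaussian_mt_covariance(X, tau):
    """Empirical Gaussian MT-covariance, eqs. (10)-(12) with (20); rows of X are snapshots."""
    sq = np.sum(np.abs(X) ** 2, axis=1)
    w = np.exp(-(sq - sq.min()) / tau**2)
    phi = w / w.sum()
    Xc = X - phi @ X
    return (Xc.T * phi) @ Xc.conj()

def select_tau(X, c=5.0, max_iter=200, tol=1e-8):
    """Iterate tau <- sqrt((c + 1) * lambda_max(MT-covariance(tau))), cf. (27)."""
    X = np.asarray(X, dtype=complex)
    # Hypothetical, outlier-resistant starting point.
    tau = np.sqrt(np.median(np.sum(np.abs(X) ** 2, axis=1)))
    for _ in range(max_iter):
        lam_max = np.linalg.eigvalsh(gaussian_mt_covariance(X, tau))[-1]
        tau_new = np.sqrt((c + 1.0) * lam_max)
        if abs(tau_new - tau) <= tol * tau:
            return tau_new
        tau = tau_new
    return tau

rng = np.random.default_rng(0)
X = (rng.standard_normal((20000, 3)) + 1j * rng.standard_normal((20000, 3))) / np.sqrt(2)
print(select_tau(X, c=5.0))
```

For normally distributed data with $\boldsymbol{\Sigma}_X = \mathbf{I}$, relation (25) predicts a value near $\sqrt{c}$, which gives a simple sanity check of the iteration.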

IV. Estimation of the number of signals 

We estimate the number of signals using a measure- 
transformed version of the minimum description length (MDL) 
criterion [24], called MT-MDL, that is obtained by replacing 
the eigenvalues of the SCM with the eigenvalues of the 
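empirical MT-covariance. The MT-MDL criterion takes the
form:

$$\mathrm{MDL}^{(u)}(k) = -\log \left( \frac{\prod_{i=k+1}^{p} \left( \hat{\lambda}_i^{(u)} \right)^{\frac{1}{p-k}}}{\frac{1}{p-k} \sum_{i=k+1}^{p} \hat{\lambda}_i^{(u)}} \right)^{(p-k)N} + \frac{1}{2} k (2p - k) \log N, \tag{28}$$

where $\hat{\lambda}_1^{(u)} \geq \cdots \geq \hat{\lambda}_p^{(u)}$ denote the eigenvalues of $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ and
$N$ is the number of observations (snapshots). The estimated
number of signals, $\hat{q}$, is obtained by minimizing (28) over
$k \in \{0, \ldots, p - 1\}$.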
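
As a small illustration of how the criterion (28) could be evaluated, the Python/NumPy sketch below takes a set of positive eigenvalues (in practice those of the empirical MT-covariance, e.g., obtained with `numpy.linalg.eigvalsh`) and returns the minimizing order. The function name `mt_mdl_order` and the toy eigenvalues are hypothetical.

```python
import numpy as np

def mt_mdl_order(eigvals, n_snapshots):
    """Minimize the criterion (28) over k = 0, ..., p-1.

    eigvals     : positive eigenvalues of the empirical MT-covariance.
    n_snapshots : number of observations N.
    Returns (estimated number of signals, array of criterion values).
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    p = lam.size
    scores = np.empty(p)
    for k in range(p):
        tail = lam[k:]                        # the p - k smallest eigenvalues
        log_geo = np.mean(np.log(tail))       # log of their geometric mean
        log_arith = np.log(np.mean(tail))     # log of their arithmetic mean
        scores[k] = (-(p - k) * n_snapshots * (log_geo - log_arith)
                     + 0.5 * k * (2 * p - k) * np.log(n_snapshots))
    return int(np.argmin(scores)), scores

q_hat, scores = mt_mdl_order([10.0, 8.0, 1.0, 1.0, 1.0, 1.0], n_snapshots=100)
print(q_hat)  # → 2
```

In this toy example the four smallest eigenvalues are exactly equal, so the log-likelihood term vanishes for $k \geq 2$ and the penalty term selects $k = 2$.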

Under the conditions that the eigenvalues of the SCM
are strongly consistent with asymptotic convergence rate of
$O\left( \sqrt{N^{-1} \log \log N} \right)$ and that the $p - q$ smallest eigenvalues
of the covariance matrix are equal and separated from its
$q$ largest eigenvalues, it has been shown in [35] that minimization
of the MDL criterion leads to strongly consistent
estimates of the number of signals for any underlying probability
distribution of the data. Thus, when the eigenvalues of
$\boldsymbol{\Sigma}_X^{(u)}$ and $\hat{\boldsymbol{\Sigma}}_X^{(u)}$ satisfy these conditions, namely $\hat{\lambda}_k^{(u)}$ converges
almost surely to $\lambda_k^{(u)}$ for all $k = 1, \ldots, p$ with the same
asymptotic convergence rate as the eigenvalues of the SCM,
and the eigenvalues of $\boldsymbol{\Sigma}_X^{(u)}$ satisfy (18), the resulting
MT-MDL based estimator, $\hat{q}$, must be strongly consistent. This
rationale is used for proving the following theorem that states
a sufficient condition for strong consistency of the estimator
$\hat{q}$.

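Theorem 3. Let $\mathbf{X}(n)$, $n = 1, \ldots, N$ denote a sequence of
i.i.d. samples from the probability distribution $P_X$ of the array
output (1), with MT-covariance $\boldsymbol{\Sigma}_X^{(u)}$ whose eigenvalues satisfy
(18). If

$$\mathrm{E}\left[ u^2(\mathbf{X}); P_X \right] < \infty \quad \text{and} \quad \mathrm{E}\left[ \|\mathbf{X}\|_2^4\, u^2(\mathbf{X}); P_X \right] < \infty, \tag{29}$$

then $\hat{q} \to q$ a.s. as $N \to \infty$. [A proof is given in Appendix I]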


Notice that the Gaussian MT-function (20) always satisfies
the condition (29). Furthermore, as shown in Subsection III-B,
the Gaussian MT-covariance $\boldsymbol{\Sigma}_X^{(u_G)}(\tau)$ satisfies Condition 2
in Subsection II-E for spherically distributed noise. Therefore,
in this case, minimization of the MT-MDL criterion with the
Gaussian MT-function (Gaussian MT-MDL) results in a robust
and strongly consistent estimate of the number of signals.
We propose to choose the scale parameter $\tau$ of the Gaussian
MT-function using the same selection rule (27) that prevents
significant information loss for estimation of the DOAs (model
parameters), and does not require any knowledge about the
number of signals (model order). The idea of estimating
the DOAs and the number of signals using the same scale
parameter $\tau$, i.e., under the same transformed probability
measure, arises from the intuition that if there is no significant
information loss for estimating the model parameters, then
there will be no significant information loss for estimating
the model order.

V. The spatially smoothed Gaussian MT-MUSIC for coherent signals

In this section, we consider the case of coherent signals
contaminated by spherically distributed noise. In this scenario,
the components of the latent vector $\mathbf{S}(n)$ in (1) are phase-delayed
amplitude-weighted replicas of a single first-order
stationary random signal $s(n)$, i.e.,

$$\mathbf{S}(n) = \boldsymbol{\xi}\, s(n), \tag{30}$$

where $\boldsymbol{\xi} \in \mathbb{C}^q$ is a vector of deterministic complex attenuation
coefficients. Similarly to the standard covariance $\boldsymbol{\Sigma}_X$, the noise
subspace cannot be determined from the eigendecomposition
of the Gaussian MT-covariance $\boldsymbol{\Sigma}_X^{(u_G)}(\tau)$, and therefore, the
Gaussian MT-MUSIC will fail in estimating the DOAs. Fortunately,
similarly to [37], we show that for a uniform linearly
spaced array (ULA) [25] the DOAs can be determined using
a spatially smoothed version of the Gaussian MT-covariance.

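We partition $\Sigma_X^{(u_G)}(\tau)$ into $L = p - r + 1$ overlapping
forward and backward square sub-matrices of dimension $r < p$.
The entries of the $l$-th forward sub-matrix are given by

$$\left[C_{f,l}^{(u_G)}(\tau)\right]_{j,k} \triangleq \left[\Sigma_X^{(u_G)}(\tau)\right]_{l+j-1,\,l+k-1}, \qquad (31)$$

$j, k = 1, \ldots, r$, and the entries of the $l$-th backward sub-matrix
are given by

$$\left[C_{b,l}^{(u_G)}(\tau)\right]_{j,k} \triangleq \left[\Sigma_X^{(u_G)}(\tau)\right]^{*}_{p-l-j+2,\,p-l-k+2}, \qquad (32)$$

$j, k = 1, \ldots, r$, where $(\cdot)^*$ denotes the complex conjugate.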

These matrices correspond to overlapping forward and backward
subarrays of size $r$, respectively. The forward spatially
smoothed Gaussian MT-covariance is defined as the average
over the $L$ forward sub-matrices:

$$C_f^{(u_G)}(\tau) \triangleq \frac{1}{L}\sum_{l=1}^{L} C_{f,l}^{(u_G)}(\tau). \qquad (33)$$

Similarly, the backward spatially smoothed Gaussian MT-covariance
is defined as the average over the $L$ backward sub-matrices:

$$C_b^{(u_G)}(\tau) \triangleq \frac{1}{L}\sum_{l=1}^{L} C_{b,l}^{(u_G)}(\tau). \qquad (34)$$

The forward/backward spatially smoothed Gaussian MT-covariance
matrix is given by

$$C_{fb}^{(u_G)}(\tau) \triangleq \frac{1}{2}\left(C_f^{(u_G)}(\tau) + C_b^{(u_G)}(\tau)\right). \qquad (35)$$
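
The construction in (31)-(35) can be illustrated with a short NumPy sketch. The function name `spatially_smooth` and the toy two-source example are assumptions made for illustration only; the input may be any $p \times p$ matrix, such as an MT-covariance or its empirical estimate.

```python
import numpy as np

def spatially_smooth(cov: np.ndarray, r: int) -> np.ndarray:
    """Forward/backward spatially smoothed r x r matrix, following (31)-(35)."""
    p = cov.shape[0]
    if cov.shape != (p, p) or not 1 <= r <= p:
        raise ValueError("cov must be p x p with 1 <= r <= p")
    L = p - r + 1                          # sub-matrices per direction
    # Entry (a, b) of `flipped` is conj(cov[p-1-a, p-1-b]), as required by (32).
    flipped = np.conj(cov[::-1, ::-1])
    forward = np.zeros((r, r), dtype=complex)
    backward = np.zeros((r, r), dtype=complex)
    for l in range(L):                     # l = 0..L-1 maps to l = 1..L in the text
        forward += cov[l:l + r, l:l + r]
        backward += flipped[l:l + r, l:l + r]
    return 0.5 * (forward / L + backward / L)

# Toy example (hypothetical values): two coherent sources give a rank-one
# signal covariance; smoothing with L = 4 >= q = 2 restores rank two.
p, r = 8, 5
k = np.arange(p)[:, None]
A = np.exp(-1j * np.pi * k * np.sin(np.deg2rad([-10.0, 20.0]))[None, :])
d = A @ np.array([1.0, 0.5 * np.exp(1j)])
R = np.outer(d, d.conj())
smoothed_eigs = np.linalg.eigvalsh(spatially_smooth(R, r))
```

For a Hermitian Toeplitz input the forward and backward blocks all coincide with the leading $r \times r$ block, which gives a quick sanity check of the indexing.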

We define $B \triangleq [b(\theta_1), \ldots, b(\theta_q)] \in \mathbb{C}^{r \times q}$ as the steering
matrix of a ULA with $r < p$ sensors, where
$b(\theta) \triangleq [1, e^{-i2\pi(d/\lambda)\sin(\theta)}, \ldots, e^{-i2\pi(r-1)(d/\lambda)\sin(\theta)}]^T$
is the array steering vector toward direction $\theta$ and $d$ is the sensor spacing,
i.e., $B$ is a sub-matrix of the steering matrix $A$ in (1)
comprised of its first $r$ rows. The following Proposition states
sufficient conditions under which the null-space of $B^H$ can be
determined from the eigendecomposition of $C_{fb}^{(u_G)}(\tau)$ in (35).
The same conditions were proved in [37] for the spatially
smoothed SCM.


Proposition 5. Define $H = G_\xi^{-1} G_\delta$, where $G_\xi = \mathrm{diag}(\xi)$,
$G_\delta = \mathrm{diag}(\delta)$, $\delta = (D^{p-1}\xi)^*$, and $D^l$ is the $l$-th power of
the diagonal matrix $D = \mathrm{diag}([v(\theta_1), \ldots, v(\theta_q)])$, $v(\theta) =
\exp(-i2\pi(d/\lambda)\sin(\theta))$. Additionally, let $\lambda_1^{(u_G)} \ge \cdots \ge
\lambda_r^{(u_G)}$ denote the eigenvalues of $C_{fb}^{(u_G)}(\tau)$. If the dimension
$r$ of the forward and backward sub-matrices (31) and (32) is
chosen such that the resulting number of sub-matrices in each
direction (forward or backward) $L$ satisfies:

1) $L \ge q$, or

2) $2L \ge q$ and the largest subset of equal diagonal entries
of $H$ is at most of size $L$,

then the $r - q$ smallest eigenvalues of $C_{fb}^{(u_G)}(\tau)$ satisfy
$\lambda_q^{(u_G)} > \lambda_{q+1}^{(u_G)} = \cdots = \lambda_r^{(u_G)}$, and their corresponding eigenvectors
span the null-space of $B^H$. [A proof is given in Appendix J]


Hence, by proper choice of $r$, such that either one of the
conditions stated above is satisfied, the spatially smoothed
Gaussian MT-MUSIC is obtained by replacing the empirical
Gaussian MT-covariance $\hat{\Sigma}_X^{(u_G)}(\tau)$ with its spatially smoothed
version $\hat{C}_{fb}^{(u_G)}(\tau)$.

The number of signals is estimated using a measure-transformed
version of the modified MDL (MMDL) criterion
used in [38] for cases where forward/backward spatial smoothing
is performed. Similarly to (28), this criterion, called here
Gaussian MT-MMDL, is obtained by replacing the eigenvalues
of the spatially smoothed SCM with the eigenvalues of
$\hat{C}_{fb}^{(u_G)}(\tau)$. The Gaussian MT-MMDL criterion takes the form:

$$\mathrm{MDL}(k) = -\log\left(\frac{\prod_{m=k+1}^{r}\left(\hat{\lambda}_m^{(u_G)}\right)^{\frac{1}{r-k}}}{\frac{1}{r-k}\sum_{m=k+1}^{r}\hat{\lambda}_m^{(u_G)}}\right)^{(r-k)N} + \frac{1}{4}k(2r-k+1)\log N, \qquad (36)$$

where $\hat{\lambda}_1^{(u_G)} \ge \cdots \ge \hat{\lambda}_r^{(u_G)}$ denote the eigenvalues of
$\hat{C}_{fb}^{(u_G)}(\tau)$ and $N$ is the number of observations (snapshots).
The estimated number of signals, $\hat{q}$, is obtained by minimizing
(36) over $k \in \{0, \ldots, r-1\}$.
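
As a rough illustration of how (36) is evaluated, the sketch below scores each candidate order from a list of eigenvalues. The function name `mmdl_order` is hypothetical, and the penalty coefficient 1/4 is taken to be that of the forward/backward modified MDL criterion of [38], which is an assumption about the form of (36).

```python
import numpy as np

def mmdl_order(eigvals, n_snapshots):
    """Return (argmin over k, scores) of an MDL-type criterion shaped like (36)."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    r = lam.size
    scores = []
    for k in range(r):                        # candidate orders k = 0..r-1
        tail = lam[k:]                        # the r-k smallest eigenvalues
        log_geo = np.mean(np.log(tail))       # log of their geometric mean
        log_ari = np.log(np.mean(tail))       # log of their arithmetic mean
        fit = -(r - k) * n_snapshots * (log_geo - log_ari)
        # Penalty coefficient 1/4 assumed from the forward/backward MMDL of [38].
        penalty = 0.25 * k * (2 * r - k + 1) * np.log(n_snapshots)
        scores.append(fit + penalty)
    scores = np.array(scores)
    return int(np.argmin(scores)), scores

# Two dominant eigenvalues followed by a flat noise floor (made-up numbers).
k_hat, scores = mmdl_order([10.0, 8.0, 1.0, 1.0, 1.0, 1.0], n_snapshots=1000)
```

The fit term vanishes when the trailing eigenvalues are equal, so the penalty alone separates the candidate orders once the noise eigenvalues have been isolated.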

Finally, we choose the scale parameter $\tau$ of the Gaussian
MT-function using the selection rule (27) with $\hat{C}_{fb}^{(u_G)}(\tau)$
instead of $\hat{\Sigma}_X^{(u_G)}(\tau)$.


VI. Numerical examples 

We evaluate and compare the performance of the proposed 
MT-MUSIC DOA estimator and the MT-MDL order estimator. 
What follows is a summary of these comparisons. The DOAs 
estimation performances are evaluated under the assumption 
that the number of signals is known. We perform a separate 
evaluation of the proposed MT-MDL estimator of the num¬ 
ber of signals. We examine scenarios of non-coherent and 
coherent signals. For non-coherent signals, the Gaussian MT- 
MUSIC algorithm is compared to the standard SCM-based 
MUSIC (SCM-MUSIC) [1] and to its robust generalizations 
based on the ZMNL preprocessing (ZMNL-MUSIC) [9], the
empirical sign-covariance (SGN-MUSIC) [12], Tyler’s scatter
M-estimator (TYLER-MUSIC) [15], [18] and the maximum- 
likelihood (ML) estimators of scatter corresponding to each of 
the considered non-Gaussian noise distributions (ML-MUSIC) 
[15], [39]. The estimation performance of the number of 
signals using the MT-MDL criterion (28) with the Gaussian 
MT-function is compared to estimators using the standard 
MDL criterion [24] based on the standard SCM (SCM- 
MDL), and the MDL variants based on the SCM of the pre- 
processed data with the ZMNL function (ZMNL-MDL) [10], 
the empirical sign-covariance (SGN-MDL) [12], Tyler’s scatter 
M-estimator (TYLER-MDL) [39] and the ML estimators of 
scatter corresponding to each of the considered non-Gaussian 
noise distributions (ML-MDL) [15], [39]. For coherent signals,
the spatially smoothed (SS) Gaussian MT-MUSIC algorithm, 
discussed in Section V, is compared to the spatially smoothed 
versions of the SCM-MUSIC [37], ZMNL-MUSIC, SGN- 
MUSIC [12], TYLER-MUSIC and ML-MUSIC. Estimation 
performance of the number of signals using the Gaussian MT- 
MMDL criterion (36) is compared to those obtained by the 
modified MDL criterion for coherent signals [38] based on the 
forward/backward spatially smoothed versions of the standard 
SCM, the SCM of the preprocessed data with the ZMNL 
function [9], the empirical sign-covariance [12], Tyler’s scatter 
M-estimator and the ML estimators of scatter corresponding 
to each non-Gaussian noise distribution considered in the 
simulation examples. 

We consider the following $p$-variate complex spherical
compound Gaussian noise distributions with zero location
parameter and isotropic dispersion $\sigma_W^2 I$: Gaussian, Cauchy,
K-distribution with shape parameter $\nu = 0.75$, and compound
Gaussian distribution with inverse Gaussian texture and shape
parameter $\lambda = 0.1$. Notice that unlike the Gaussian distribution,
the other noise distributions are heavy tailed. Random
sampling from the considered noise distributions and their
applicability for modelling radar clutter are discussed in detail
in [15] and [29]. Let $\sigma_{S_k}^2$, $k = 1, \ldots, q$, denote the variances
of the received signals. The generalized signal-to-noise ratio
(GSNR) is defined as $\mathrm{GSNR} \triangleq 10\log_{10}\left(\frac{1}{q}\sum_{k=1}^{q}\sigma_{S_k}^2 / \sigma_W^2\right)$ and
is used to index the estimation performance.

Following the approach proposed in Subsection III-D, for
non-coherent signals, we select the scale parameter $\tau$ of the
Gaussian MT-function (20) as the solution of (27) with $c = 5$.
This choice of the constant $c$ guarantees a relative transform-domain
Fisher information loss of no more than approximately 30%.
The solution of (27) is obtained using fixed-point iteration
with initial condition tq = 5Sfc=i ^x k ’ w h ere &x k = 

7 2 [(MAD({Re(X fe , n )}£U)) 2 V(MAD({Im(X fe , n )}^ =1 )) 2 ], 
7 = 1/erf 1 (3/4), is a robust median absolute deviation 
(MAD) estimate of variance [17]. The maximum number
of iterations and the stopping criterion were set to 100 and
$|\tau_l - \tau_{l-1}|/\tau_{l-1} < 10^{-6}$, respectively, where $l$ is an iteration
index. For coherent signals, we replaced the empirical Gaussian
MT-covariance in (27) with its spatially smoothed version
and applied the same selection procedure for $\tau$.

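The maximum number of iterations and the stopping criterion
in Tyler’s scatter M-estimator and the ML estimators of
scatter were set to 100 and

$$\left\|\hat{\Sigma}_{X,l}^{\mathrm{Tyler/ML}} - \hat{\Sigma}_{X,l-1}^{\mathrm{Tyler/ML}}\right\|_{Fro} \Big/ \left\|\hat{\Sigma}_{X,l-1}^{\mathrm{Tyler/ML}}\right\|_{Fro} < 10^{-6},$$

respectively.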
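
The iteration cap and relative Frobenius-norm stopping rule above can be sketched for Tyler's scatter M-estimator as follows. This is a minimal illustration of the standard fixed-point recursion with a trace normalization, not the authors' implementation; the function name `tyler_scatter` and the normalization to trace $p$ are assumptions.

```python
import numpy as np

def tyler_scatter(X, max_iter=100, tol=1e-6):
    """Tyler's fixed-point scatter estimate for a p x N complex data matrix X
    (zero location assumed), normalized to have trace p."""
    p, N = X.shape
    sigma = np.eye(p, dtype=complex)
    for _ in range(max_iter):
        # Quadratic forms x_n^H sigma^{-1} x_n, one per snapshot.
        q = np.real(np.einsum("in,in->n", X.conj(), np.linalg.solve(sigma, X)))
        new = (p / N) * (X / q) @ X.conj().T
        new *= p / np.real(np.trace(new))          # fix the arbitrary scale
        # Relative change in Frobenius norm, as in the stopping rule above.
        rel = np.linalg.norm(new - sigma, "fro") / np.linalg.norm(sigma, "fro")
        sigma = new
        if rel < tol:
            break
    return sigma
```

Because each snapshot enters only through $x_n x_n^H / (x_n^H \Sigma^{-1} x_n)$, rescaling individual snapshots by arbitrary positive factors leaves the estimate unchanged.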

In all examples, the performances versus GSNR were evaluated
for $N = 1000$ i.i.d. snapshots. The performances versus
the number of snapshots were evaluated at the threshold GSNR
point obtained by the Gaussian MT-MUSIC algorithm for $N =
1000$ i.i.d. snapshots. The parameter space $\Theta = [-90^\circ, 90^\circ)$
was sampled uniformly with sampling interval $\Delta = 0.0018^\circ$.
All performance measures were averaged over $10^4$ Monte-Carlo
simulations.
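
The grid search over $\Theta$ can be sketched as below for a half-wavelength ULA. This is only an illustration: the function names, the coarse 0.1° grid, the noise level and the use of an exact (noise-free) covariance are assumptions chosen to keep the example small; any covariance estimate, including an empirical MT-covariance, could be passed in its place.

```python
import numpy as np

def ula_steering(theta_deg, n_sensors, spacing=0.5):
    """Columns are [1, e^{-i 2 pi (d/lambda) sin(theta)}, ...]^T; spacing = d/lambda."""
    theta = np.deg2rad(np.atleast_1d(theta_deg))
    k = np.arange(n_sensors)[:, None]
    return np.exp(-2j * np.pi * spacing * k * np.sin(theta)[None, :])

def music_doas(cov, q, grid_deg):
    """Return the q grid angles with the largest MUSIC pseudo-spectrum peaks."""
    p = cov.shape[0]
    _, vecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    noise = vecs[:, : p - q]                  # eigenvectors of the p-q smallest
    proj = np.sum(np.abs(noise.conj().T @ ula_steering(grid_deg, p)) ** 2, axis=0)
    spectrum = 1.0 / np.maximum(proj, 1e-12)  # guard against division by zero
    mid = spectrum[1:-1]
    is_peak = (mid > spectrum[:-2]) & (mid >= spectrum[2:])
    peaks = np.flatnonzero(is_peak) + 1
    top = peaks[np.argsort(spectrum[peaks])[-q:]]
    return np.sort(grid_deg[top]), spectrum

true_doas = np.array([-10.0, 0.0, 5.0, 15.0, 35.0])
A = ula_steering(true_doas, 16)
cov = A @ A.conj().T + 0.1 * np.eye(16)       # exact covariance, white noise
grid = np.arange(-90.0, 90.0, 0.1)            # much coarser than 0.0018 degrees
est_doas, _ = music_doas(cov, 5, grid)
```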


A. Non-coherent signals 

In this example, we considered five independent 4-QAM
signals with equal power $\sigma_S^2$ impinging on a 16-element
uniform linear array with $\lambda/2$ spacing from DOAs $\theta_1 = -10^\circ$,
$\theta_2 = 0^\circ$, $\theta_3 = 5^\circ$, $\theta_4 = 15^\circ$, and $\theta_5 = 35^\circ$. The average
RMSEs for estimating the DOAs and the error rates for estimating
the number of signals versus GSNR and the number of
snapshots are depicted in Figs. 2-5 for each noise distribution.
Notice that for the Gaussian noise case, there is no significant
performance gap between the compared methods. For the
other noise distributions, the proposed Gaussian MT-MUSIC
and the Gaussian MT-MDL based estimation of the number
of signals outperform all other robust MUSIC and MDL
generalizations in the low GSNR and low sample size regimes,
with significantly lower threshold regions. This performance
advantage may be attributed to the fact that unlike the empirical
sign-covariance, Tyler’s scatter M-estimator, and the
ML estimators of scatter corresponding to each non-Gaussian
noise distribution, the influence function of the empirical
Gaussian MT-covariance is negligible for large norm outliers
(as illustrated in Fig. 1), which are likely in low GSNRs
and have a stronger effect as the sample size decreases.
Furthermore, unlike the ZMNL preprocessing based technique,
the proposed measure-transformation approach preserves the
noise subspace and effectively suppresses outliers without
significant information loss for estimating the DOAs and the
number of signals.

B. Coherent signals 

In this example, we considered five coherent signals impinging
on a 22-element uniform linear array with $\lambda/2$ spacing
from DOAs $\theta_1 = -17^\circ$, $\theta_2 = -3^\circ$, $\theta_3 = 2^\circ$, $\theta_4 = 13^\circ$
and $\theta_5 = 20^\circ$. The signals were generated according to the
model (30), where $s(n)$ is a 4-QAM signal with power $\sigma_s^2$.
The attenuation coefficients were set to $\xi_1 = 0.8\exp(i\pi/3)$,
$\xi_2 = 1$, $\xi_3 = 0.9\exp(i\pi/4)$, $\xi_4 = 0.7\exp(i\pi/5)$ and
$\xi_5 = 0.6\exp(i\pi/6)$. The dimension of the spatially smoothed
covariance was set to $r = 16$. The average RMSEs for
estimating the DOAs and the error rates for estimating the
number of signals versus GSNR are depicted in Figs. 6-9 for
each noise distribution. Notice that for the Gaussian noise
case, there is no significant performance gap between the
compared MUSIC algorithms. Regarding the estimation of the
number of signals, the sign-covariance based modified MDL
Fig. 2. Non-coherent signals in Gaussian noise: (a) Average RMSE versus
GSNR. (b) Probability of error for estimating the number of signals versus 
GSNR. (c) Average RMSE versus the number of snapshots, (d) Probability 
of error for estimating the number of signals versus the number of snapshots. 
The performance measures versus GSNR were evaluated for N = 1000 
i.i.d. snapshots. The performance measures versus the number of snapshots 
were evaluated for GSNR = —11 [dB]. Notice that all algorithms perform 
similarly. 


Fig. 4. Non-coherent signals in K-distributed noise with shape parameter
$\nu = 0.75$: (a) Average RMSE versus GSNR. (b) Probability of error for
estimating the number of signals versus GSNR. (c) Average RMSE versus 
the number of snapshots, (d) Probability of error for estimating the number 
of signals versus the number of snapshots. The performance measures versus 
GSNR were evaluated for N = 1000 i.i.d. snapshots. The performance 
measures versus the number of snapshots were evaluated for GSNR = —19 
[dB]. Notice that the Gaussian MT-MUSIC outperforms all other compared 
algorithms in the low GSNR and low sample size regimes. Also notice that 
the Gaussian MT-MDL criterion leads to significantly lower error rates for 
estimating the number of signals. 










Fig. 3. Non-coherent signals in Cauchy noise: (a) Average RMSE versus 
GSNR. (b) Probability of error for estimating the number of signals versus 
GSNR. (c) Average RMSE versus the number of snapshots, (d) Probability 
of error for estimating the number of signals versus the number of snapshots. 
The performance measures versus GSNR were evaluated for N = 1000 
i.i.d. snapshots. The performance measures versus the number of snapshots 
were evaluated for GSNR = —11 [dB]. Notice that the Gaussian MT- 
MUSIC outperforms all other compared algorithms in the low GSNR and 
low sample size regimes. Also notice that the Gaussian MT-MDL criterion 
leads to significantly lower error rates for estimating the number of signals. 


Fig. 5. Non-coherent signals in spherical compound Gaussian noise 
with inverse-Gaussian texture and shape parameter $\lambda = 0.1$: (a) Average
RMSE versus GSNR. (b) Probability of error for estimating the number of 
signals versus GSNR. (c) Average RMSE versus the number of snapshots, (d) 
Probability of error for estimating the number of signals versus the number 
of snapshots. The performance measures versus GSNR were evaluated for 
N = 1000 i.i.d. snapshots. The performance measures versus the number of 
snapshots were evaluated for GSNR = —22 [dB]. Notice that the Gaussian 
MT-MUSIC outperforms all other compared algorithms in the low GSNR and 
low sample size regimes. Also notice that the Gaussian MT-MDL criterion 
leads to significantly lower error rates for estimating the number of signals. 
criterion results in better estimation performance. This may be
attributed to the fact that in the sign-covariance based modified 
MDL criterion [12] the eigenvalues are estimated in a more 
stable manner. For the other noise distributions, the spatially 
smoothed Gaussian MT-MUSIC and the Gaussian modified 
MT-MDL based estimation of the number of signals outper¬ 
form all other robust MUSIC and MDL generalizations in the 
low GSNR and low sample size regimes, with significantly 
lower breakdown thresholds. Again, as in the non-coherent 
case, this performance advantage may be attributed to the 
following facts. First, unlike the empirical sign-covariance, 
Tyler’s scatter M-estimator and the ML-estimators of scatter 
corresponding to each non-Gaussian noise distribution, the 
influence function of the empirical Gaussian MT-covariance 
is very small for large norm outliers. Such outliers are likely 
in low GSNRs and become more frequent when the sample 
size is small. Second, unlike the ZMNL preprocessing based 
technique, the proposed measure-transformation approach pre¬ 
serves the noise subspace and effectively suppresses outliers 
without significant performance loss in estimating the DOAs 
and the number of signals. 





Fig. 6. Coherent signals in Gaussian noise: (a) Average RMSE versus 
GSNR. (b) Probability of error for estimating the number of signals versus 
GSNR. (c) Average RMSE versus the number of snapshots, (d) Probability 
of error for estimating the number of signals versus the number of snapshots. 
The performance measures versus GSNR were evaluated for N = 1000 i.i.d. 
snapshots. The performance measures versus the number of snapshots were 
evaluated for GSNR = -12 [dB]. Notice that there is no significant
performance gap between the compared MUSIC algorithms. The sign-covariance
based modified MDL criterion results in better estimation of the number of 
signals. This may be attributed to the fact that the sign-covariance based 
modified MDL criterion [12] involves more stable eigenvalues estimation. 


VII. Conclusion 

In this paper, a new framework for robust MUSIC was 
proposed that applies a transform to the probability distribution 
of the data prior to forming the sample covariance. Under 
the assumption of spherically contoured noise distribution, a 





Fig. 7. Coherent signals in Cauchy noise: (a) Average RMSE versus 
GSNR. (b) Probability of error for estimating the number of signals versus 
GSNR. (c) Average RMSE versus the number of snapshots, (d) Probability 
of error for estimating the number of signals versus the number of snapshots. 
The performance measures versus GSNR were evaluated for N = 1000 i.i.d. 
snapshots. The performance measures versus the number of snapshots were 
evaluated for GSNR = —14 [dB]. Note that similarly to the non-coherent 
case, the Gaussian MT-MUSIC estimator has significantly lower GSNR and 
sample size threshold regions than the other methods. Also notice that the 
Gaussian MT-MMDL estimator of the number of signals outperforms all other 
MDL based estimators with significantly lower probability of error at the low 
GSNR and low sample size regimes. 


new robust MUSIC algorithm, called Gaussian MT-MUSIC, 
was presented based on a Gaussian shaped measure transform 
(MT) function. Furthermore, a new robust generalization of 
the MDL criterion for estimating the number of signals, 
called MT-MDL, was derived that is based on replacing the 
eigenvalues of the SCM with those of the empirical MT- 
covariance. The proposed Gaussian MT-MUSIC algorithm 
was extended to the case of coherent signals by applying 
spatial smoothing to the empirical Gaussian MT-covariance. 
Exploration of other classes of MT-functions may result in 
additional robust MUSIC algorithms that have different useful 
properties. Furthermore, extending the MT-MDL criterion to 
sample-starved scenarios [44] or to cases where there is 
additional information on the sample eigenvalues distribution 
[45] are worthwhile topics for future research. 

Appendix 

A. Proof of Proposition 1

Property 1:

Since $\varphi_u(x)$ is nonnegative, by Corollary 2.3.6 in [31]
$Q_X^{(u)}$ is a measure on $\mathcal{S}_X$. Furthermore, $Q_X^{(u)}(\mathcal{X}) = 1$, so that
$Q_X^{(u)}$ is a probability measure on $\mathcal{S}_X$.

Property 2:

Follows from Definitions 4.1.1 and 4.1.3 in [31].

Property 3:

Equivalently, we show that if the covariance of $g(X)$ under
$Q_X^{(u)}$ is singular, then it must be singular under $P_X$.


Fig. 8. Coherent signals in K-distributed noise with shape parameter
$\nu = 0.75$: (a) Average RMSE versus GSNR. (b) Probability of error for estimating
the number of signals versus GSNR. (c) Average RMSE versus the number 
of snapshots, (d) Probability of error for estimating the number of signals 
versus the number of snapshots. The performance measures versus GSNR 
were evaluated for N = 1000 i.i.d. snapshots. The performance measures 
versus the number of snapshots were evaluated for GSNR = —25 [dB]. Note 
that similarly to the non-coherent case, the Gaussian MT-MUSIC estimator 
has significantly lower GSNR and sample size threshold regions than the other 
methods. Also notice that the Gaussian MT-MMDL estimator of the number 
of signals outperforms all other MDL based estimators. 


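According to (7), the covariance of $g(X)$ under $Q_X^{(u)}$ is
given by:

$$\Sigma_{g(X)}^{(u)} = E\left[\left(g(X) - \mu_{g(X)}^{(u)}\right)\left(g(X) - \mu_{g(X)}^{(u)}\right)^H \varphi_u(X); P_X\right],$$

where $\mu_{g(X)}^{(u)} \triangleq E[g(X)\varphi_u(X); P_X]$ is the expectation of
$g(X)$ under $Q_X^{(u)}$. Since $\Sigma_{g(X)}^{(u)}$ is singular, there exists a
non-zero vector $a \in \mathbb{C}^m$ such that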


Fig. 9. Coherent signals in spherical compound Gaussian noise with 
inverse-Gaussian texture and shape parameter $\lambda = 0.1$: (a) Average
RMSE versus GSNR. (b) Probability of error for estimating the number of 
signals versus GSNR. (c) Average RMSE versus the number of snapshots, (d) 
Probability of error for estimating the number of signals versus the number 
of snapshots. The performance measures versus GSNR were evaluated for 
N = 1000 i.i.d. snapshots. The performance measures versus the number of 
snapshots were evaluated for GSNR = —24 [dB]. Note that similarly to the 
non-coherent case, the Gaussian MT-MUSIC estimator has significantly lower 
GSNR and sample size threshold regions than the other methods. Also notice 
that the Gaussian MT-MMDL estimator of the number of signals outperforms 
all other MDL based estimators with significantly lower probability of error 
at the low GSNR and low sample size regimes. 


B. Proof of Proposition 2

By the real-imaginary decompositions of the MT-covariance
$\Sigma_X^{(u)}$ and its empirical version $\hat{\Sigma}_X^{(u)}$ given in Lemmas 1 and
2 in Appendix C, it is sufficient to prove that $\hat{\Sigma}_Z^{(g)} \to \Sigma_Z^{(g)}$
a.s. as $N \to \infty$. According to (10)-(12)


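$$a^H \Sigma_{g(X)}^{(u)} a = E\left[\left|a^H\left(g(X) - \mu_{g(X)}^{(u)}\right)\right|^2 \varphi_u(X); P_X\right] = 0.$$

Therefore, by (2), (6), the strict positiveness of $u(\cdot)$ and
Proposition 2.3.9 in [31]

$$E\left[\left|a^H\left(g(X) - \mu_{g(X)}^{(u)}\right)\right|^2; P_X\right] = 0. \qquad (37)$$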


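The covariance of $g(X)$ under $P_X$ is given by

$$\Sigma_{g(X)} = E\left[\left(g(X) - \mu_{g(X)}\right)\left(g(X) - \mu_{g(X)}\right)^H; P_X\right],$$

where $\mu_{g(X)} \triangleq E[g(X); P_X]$ is the expectation of $g(X)$
under $P_X$. Hence, one can verify that

$$a^H \Sigma_{g(X)} a + \left|a^H\left(\mu_{g(X)} - \mu_{g(X)}^{(u)}\right)\right|^2 = E\left[\left|a^H\left(g(X) - \mu_{g(X)}^{(u)}\right)\right|^2; P_X\right] = 0, \qquad (38)$$

where the last equality stems from (37). Since the second
summand in the l.h.s. of (38) is nonnegative we conclude that
$a^H \Sigma_{g(X)} a = 0$, which implies that $\Sigma_{g(X)}$ is singular. □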


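$$\lim_{N\to\infty}\hat{\Sigma}_Z^{(g)} = \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)Z^T(n)\hat{\varphi}_g(Z(n)) - \lim_{N\to\infty}\hat{\mu}_Z^{(g)}\lim_{N\to\infty}\hat{\mu}_Z^{(g)T}, \qquad (39)$$

where

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)Z^T(n)\hat{\varphi}_g(Z(n)) = \frac{\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)Z^T(n)g(Z(n))}{\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} g(Z(n))} \qquad (40)$$

and

$$\lim_{N\to\infty}\hat{\mu}_Z^{(g)} = \frac{\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)g(Z(n))}{\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} g(Z(n))}. \qquad (41)$$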


In the following, the limits of the series in the r.h.s. of (40)
and (41) are obtained. Since $\{Z(n)\}_{n=1}^{N}$ is a sequence of
i.i.d. samples of $Z$, the random matrices
$\{Z(n)Z^T(n)g(Z(n))\}_{n=1}^{N}$ in the r.h.s. of (40)
define a sequence of i.i.d. samples of $ZZ^Tg(Z)$. Moreover,
if the condition in (13) is satisfied, then for any pair of
entries $Z_k$, $Z_l$ of $Z$ we have that


C. Real-imaginary decomposition of the MT-covariance and 
its empirical estimate 

The real-imaginary decomposition of the MT-covariance of
$X$ under $Q_X^{(u)}$ is given in the following Lemma, which follows
directly from (6), (8) and (9).


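$$\left|E\left[Z_k Z_l\, g(Z); P_Z\right]\right| = \left|E\left[Z_k g^{1/2}(Z)\, Z_l g^{1/2}(Z); P_Z\right]\right| \le \left(E\left[|Z_k|^2 g(Z); P_Z\right] E\left[|Z_l|^2 g(Z); P_Z\right]\right)^{1/2} \le E\left[\|X\|_2^2\, u(X); P_X\right] < \infty,$$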


Lemma 1. Let $U$ and $V$ denote the real and imaginary
components of $X$, respectively. Define the real random
vector $Z \triangleq [U^T, V^T]^T$ and the MT-function $g(Z) \triangleq u(X)$.
Let $\Sigma_{1,1}^{(g)}$, $\Sigma_{1,2}^{(g)}$, $\Sigma_{2,1}^{(g)}$ and $\Sigma_{2,2}^{(g)}$ denote
the $p \times p$ submatrices of the real-valued MT-covariance $\Sigma_Z^{(g)}$
satisfying

where the first semi-inequality stems from the Hölder
inequality for random variables [31], and the second one
stems from the definitions of $Z$ and $g(Z)$ in Lemma 1 in
Appendix C. Therefore, by Khinchine’s strong law of large
numbers (KSLLN) [30]

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)Z^T(n)g(Z(n)) = E\left[ZZ^Tg(Z); P_Z\right] \quad \text{a.s.} \qquad (42)$$

Similarly, it can be shown that if the condition in (13) is
satisfied, then by the KSLLN

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)g(Z(n)) = E\left[Zg(Z); P_Z\right] \quad \text{a.s.}, \qquad (43)$$

and

$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} g(Z(n)) = E\left[g(Z); P_Z\right] \quad \text{a.s.} \qquad (44)$$

Remark 1. By (44), the definition of g (Z) in Lemma 1 in 
Appendix C, and the assumption in (4) the denominator in 
the r.h.s. of (40) and (41) is non-zero almost surely. 

Therefore, since the sequences in the l.h.s. of (40) and (41) 
are obtained by continuous mappings of the elements of the 
sequences in their r.h.s., then by (42)-(44), and the 
Mann-Wald Theorem [32] 


$$\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} Z(n)Z^T(n)\hat{\varphi}_g(Z(n)) = \frac{E\left[ZZ^Tg(Z); P_Z\right]}{E\left[g(Z); P_Z\right]} = E\left[ZZ^T\varphi_g(Z); P_Z\right] \quad \text{a.s.} \qquad (45)$$

and

$$\lim_{N\to\infty}\hat{\mu}_Z^{(g)} = \frac{E\left[Zg(Z); P_Z\right]}{E\left[g(Z); P_Z\right]} = E\left[Z\varphi_g(Z); P_Z\right] \quad \text{a.s.}, \qquad (46)$$

where the last equalities in (45) and (46) follow from the
definition of $\varphi_g(\cdot)$ in (6).

Thus, since the sequence in the l.h.s. of (39) is obtained by
continuous mappings of the elements of the sequences in its
r.h.s., then by (45) and (46), the Mann-Wald Theorem, and
(8) it is concluded that $\hat{\Sigma}_Z^{(g)} \to \Sigma_Z^{(g)}$ a.s. as $N \to \infty$. □
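
The weighted-average form appearing in (39)-(41) suggests a direct way to compute the empirical MT-covariance. The following sketch does so for the Gaussian MT-function; the function name `gaussian_mt_covariance` and the numerical-stability shift are assumptions, and the constant $(\pi\tau^2)^{-p}$ of (20) is omitted because it cancels in the normalized weights.

```python
import numpy as np

def gaussian_mt_covariance(X, tau):
    """Empirical Gaussian MT-covariance of a p x N complex snapshot matrix X."""
    sq_norm = np.sum(np.abs(X) ** 2, axis=0)
    # Unnormalized weights exp(-||x||^2 / tau^2); the shift by the minimum
    # avoids underflow of all weights and cancels after normalization.
    w = np.exp(-(sq_norm - sq_norm.min()) / tau**2)
    w /= w.sum()
    mu = X @ w                                  # weighted (MT) mean
    Xc = X - mu[:, None]
    return (Xc * w) @ Xc.conj().T               # weighted (MT) covariance

rng = np.random.default_rng(0)
X_clean = rng.standard_normal((4, 300)) + 1j * rng.standard_normal((4, 300))
outlier = 1e3 * np.ones((4, 1), dtype=complex)  # one large-norm outlier
X_dirty = np.hstack([X_clean, outlier])
```

With a very large `tau` the weights become uniform and the result approaches the ordinary sample covariance, whereas for moderate `tau` a large-norm outlier receives a weight that is numerically zero.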


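$$\Sigma_Z^{(g)} = \begin{bmatrix} \Sigma_{1,1}^{(g)} & \Sigma_{1,2}^{(g)} \\ \Sigma_{2,1}^{(g)} & \Sigma_{2,2}^{(g)} \end{bmatrix}.$$

The real-imaginary decomposition of the MT-covariance of
$X$ under $Q_X^{(u)}$ takes the form:

$$\Sigma_X^{(u)} = \Sigma_{1,1}^{(g)} + \Sigma_{2,2}^{(g)} + i\left(\Sigma_{2,1}^{(g)} - \Sigma_{1,2}^{(g)}\right). \qquad (47)$$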

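Similarly, the real-imaginary decomposition of the empirical
MT-covariance of $X$ under $Q_X^{(u)}$ is given in the following
Lemma, which follows directly from (10)-(12).

Lemma 2. Let $X(n)$, $n = 1, \ldots, N$, denote a sequence of
samples from $P_X$, and let $U(n)$, $V(n)$ denote the real and
imaginary components of $X(n)$, respectively. Define the real
random vector $Z(n) \triangleq [U^T(n), V^T(n)]^T$ and the
MT-function $g(Z(n)) \triangleq u(X(n))$. Let $\hat{\Sigma}_{1,1}^{(g)}$, $\hat{\Sigma}_{1,2}^{(g)}$,
$\hat{\Sigma}_{2,1}^{(g)}$ and $\hat{\Sigma}_{2,2}^{(g)}$ denote the $p \times p$ submatrices of the
real-valued empirical MT-covariance $\hat{\Sigma}_Z^{(g)}$ satisfying

$$\hat{\Sigma}_Z^{(g)} = \begin{bmatrix} \hat{\Sigma}_{1,1}^{(g)} & \hat{\Sigma}_{1,2}^{(g)} \\ \hat{\Sigma}_{2,1}^{(g)} & \hat{\Sigma}_{2,2}^{(g)} \end{bmatrix}.$$

The real-imaginary decomposition of the empirical
MT-covariance of $X$ under $Q_X^{(u)}$ takes the form:

$$\hat{\Sigma}_X^{(u)} = \hat{\Sigma}_{1,1}^{(g)} + \hat{\Sigma}_{2,2}^{(g)} + i\left(\hat{\Sigma}_{2,1}^{(g)} - \hat{\Sigma}_{1,2}^{(g)}\right). \qquad (48)$$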
D. Proof of Proposition 3

The influence function (17) can be written as:

$$\mathrm{IF}_{\Psi_X^{(u)}}(y; P_X) = c\,u(y)\left(\left\|y - \mu_X^{(u)}\right\|_2^2 G(y) - \Sigma_X^{(u)}\right), \qquad (49)$$

where $c \triangleq E^{-1}[u(X); P_X]$, $G(y) \triangleq \psi(y)\psi^H(y)$ and

$$\psi(y) \triangleq \frac{y - \mu_X^{(u)}}{\left\|y - \mu_X^{(u)}\right\|_2}.$$

Since $\|\psi(y)\|_2 = 1$ for any $y \in \mathbb{C}^p$, the real and imaginary
components of $G(y)$ in (49) are bounded. Thus, the
influence function $\mathrm{IF}_{\Psi_X^{(u)}}(y; P_X)$ is bounded if $u(y)$ and
$u(y)\|y\|_2^2$ are bounded. □


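E. Proof of Proposition 4

According to (17)

$$\left\|\mathrm{IF}_{\Psi_X^{(u)}}(y; P_X)\right\|_{Fro}^2 = c\,u^2(y)\left(\left\|y - \mu_X^{(u)}\right\|_2^4 - 2\left\|y - \mu_X^{(u)}\right\|_{\Sigma_X^{(u)}}^2 + \mathrm{tr}\left[\left(\Sigma_X^{(u)}\right)^2\right]\right)$$
$$\le c\left(\left(\sqrt{u(y)}\,\|y\|_2 + \sqrt{u(y)}\,\left\|\mu_X^{(u)}\right\|_2\right)^4 + u^2(y)\,\mathrm{tr}\left[\left(\Sigma_X^{(u)}\right)^2\right]\right), \qquad (50)$$

where $c \triangleq E^{-2}[u(X); P_X]$, $\|y\|_C^2 \triangleq y^H C y$, and the
semi-inequality follows from the triangle inequality and the
positive semi-definiteness of $\Sigma_X^{(u)}$. By (20),

$$u(y) = u_G(y; \tau) = (\pi\tau^2)^{-p}\phi(r)$$

and

$$u(y)\|y\|_2^2 = (\pi\tau^2)^{-p}\left(\frac{\tau^4}{4}\frac{\partial^2\phi(r)}{\partial r^2} + \frac{\tau^2}{2}\phi(r)\right),$$

where $r \triangleq \|y\|_2$ and $\phi(r) \triangleq \exp(-r^2/\tau^2)$. Therefore, since
for any fixed $\tau$ we have $\phi(r) \to 0$ and $\partial^2\phi(r)/\partial r^2 \to 0$ as
$r \to \infty$, we conclude that (21) holds. □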


F. Proof of Theorem 1 

According to (1) and (22), the conditional probability
distribution $P_{X|\nu,S}$ of $X$ given $\nu$ and $S$ is proper complex
normal with location parameter $\mu_{X|\nu,S} = AS$ and covariance
matrix $\Sigma_{X|\nu,S} = \nu^2 I$. Therefore, using (2), (20) and the law
of total expectation one can verify that:

$$E\left[u_G(X;\tau); P_X\right] = E\left[g(\alpha, S;\tau); P_{\alpha,S}\right] = E\left[h(\alpha;\tau); P_\alpha\right], \qquad (51)$$

$$E\left[X u_G(X;\tau); P_X\right] = A\,E\left[\alpha^2 S\, g(\alpha, S;\tau); P_{\alpha,S}\right], \qquad (52)$$

and

$$E\left[XX^H u_G(X;\tau); P_X\right] = A\,E\left[\alpha^4 SS^H g(\alpha, S;\tau); P_{\alpha,S}\right]A^H + E\left[\alpha^2\nu^2 h(\alpha;\tau); P_\nu\right]I, \qquad (53)$$

where $\alpha \triangleq \sqrt{\tau^2/(\tau^2+\nu^2)}$,

$$g(\alpha, S;\tau) \triangleq \left(\frac{\alpha^2}{\pi\tau^2}\right)^p \exp\left(-\alpha^2\|AS\|_2^2/\tau^2\right),$$

and $h(\alpha;\tau) \triangleq E[g(\alpha, S;\tau); P_S]$. Notice that by (22) the
expectation in the second summand of (53) can be rewritten
as:

$$E\left[\alpha^2\nu^2 h(\alpha;\tau); P_\nu\right] = E\left[|\alpha W|^2 h(\alpha;\tau); P_{\alpha,W}\right] - \left|E\left[\alpha W h(\alpha;\tau); P_{\alpha,W}\right]\right|^2, \qquad (54)$$

where the scalar $W$ denotes any of the identically distributed
components of the noise vector $W$. Thus, using (6), (8) and
(51)-(54) the relation (23) is easily obtained. Finally, since
the MT-function $g(\cdot,\cdot;\cdot)$ is strictly positive and $\alpha^2 S$ has a
non-singular covariance under the joint probability measure
$P_{\alpha,S}$, by Property 3 in Proposition 1 the MT-covariance
$\Sigma_{\alpha^2 S}^{(g)}(\tau)$ must be non-singular. □


G. Proof of Theorem 2 

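Since $P_X$ is proper complex normal with location parameter
$\mu_X = 0$ and covariance matrix $\Sigma_X$, by (6) and (7) the
transformed probability distribution $Q_X^{(u_G)}$ generated by the
Gaussian MT-function (20) is proper complex normal with
location parameter $\mu_X^{(u_G)} = 0$ and covariance matrix

$$\Sigma_X^{(u_G)}(\tau) = \left(\Sigma_X^{-1} + \tau^{-2}I\right)^{-1}. \qquad (55)$$

According to the Gaussian Fisher information formula, stated
in [33], the Fisher informations for estimating $\theta_k$ under $P_X$
and $Q_X^{(u_G)}$ are given by:

$$F(\theta_k; P_X) = \mathrm{tr}\left[\left(\Sigma_X^{-1}\frac{\partial\Sigma_X}{\partial\theta_k}\right)^2\right] \qquad (56)$$

and

$$F\left(\theta_k; Q_X^{(u_G)}\right) = \mathrm{tr}\left[\left(\left(\Sigma_X^{(u_G)}(\tau)\right)^{-1}\frac{\partial\Sigma_X^{(u_G)}(\tau)}{\partial\theta_k}\right)^2\right], \qquad (57)$$

respectively. By (55) and the matrix identity
$\frac{\partial C^{-1}}{\partial\alpha} = -C^{-1}\frac{\partial C}{\partial\alpha}C^{-1}$ [42], where $C$ is some invertible
complex matrix and $\alpha \in \mathbb{R}$, we have

$$\frac{\partial\Sigma_X^{(u_G)}}{\partial\theta_k} = \Sigma_X^{(u_G)}\Sigma_X^{-1}\frac{\partial\Sigma_X}{\partial\theta_k}\Sigma_X^{-1}\Sigma_X^{(u_G)}. \qquad (58)$$

Therefore, using (55)-(58) and Lemma 3 in Appendix H we
obtain that:

$$F\left(\theta_k; Q_X^{(u_G)}\right) \le \mathrm{tr}\left[\left(\Sigma_X^{-1}\frac{\partial\Sigma_X}{\partial\theta_k}\right)^2\right]\lambda_{\max}^2\left(\Sigma_X^{-1}\Sigma_X^{(u_G)}\right) = F(\theta_k; P_X)\left(\frac{\tau^2}{\tau^2 + \lambda_{\min}(\Sigma_X)}\right)^2 \qquad (59)$$

and

$$F\left(\theta_k; Q_X^{(u_G)}\right) \ge \mathrm{tr}\left[\left(\Sigma_X^{-1}\frac{\partial\Sigma_X}{\partial\theta_k}\right)^2\right]\lambda_{\min}^2\left(\Sigma_X^{-1}\Sigma_X^{(u_G)}\right) = F(\theta_k; P_X)\left(\frac{\tau^2}{\tau^2 + \lambda_{\max}(\Sigma_X)}\right)^2. \qquad (60)$$

The relations in (24) are obtained from (59) and (60). □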
H. Some useful trace inequalities

Lemma 3. Let $A$, $B$ and $C$ denote Hermitian matrices with
the same dimensions, and assume that $A$ and $C$ are
positive-definite and $B$ is positive-semidefinite. Then

$$\mathrm{tr}[AB]\,\lambda_{\min}(AC) \le \mathrm{tr}[ABAC] \le \mathrm{tr}[AB]\,\lambda_{\max}(AC), \qquad (61)$$

where $\lambda_{\min}(\cdot)$ and $\lambda_{\max}(\cdot)$ denote the minimal and maximal
eigenvalues, respectively.

Proof: By the invariance of the trace operator to
multiplication order of two matrices, inequalities (I) and (II)
in [34], and the fact that $A^{1/2}CA^{1/2}$ is Hermitian and
similar to $AC$ we conclude that
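
A quick numerical check of (61) on random matrices may help build intuition; it is not a substitute for the proof. The helper names below are hypothetical.

```python
import numpy as np

def random_hermitian_pd(n, rng):
    """Random Hermitian positive-definite n x n matrix."""
    M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return M @ M.conj().T + 0.1 * np.eye(n)

def trace_bounds(A, B, C):
    """Return (lower, middle, upper) of the chain in (61)."""
    # AC is similar to the Hermitian matrix A^{1/2} C A^{1/2}, so its
    # eigenvalues are real and positive up to rounding error.
    eig_ac = np.linalg.eigvals(A @ C).real
    tr_ab = np.trace(A @ B).real
    middle = np.trace(A @ B @ A @ C).real
    return tr_ab * eig_ac.min(), middle, tr_ab * eig_ac.max()

rng = np.random.default_rng(1)
A = random_hermitian_pd(4, rng)
C = random_hermitian_pd(4, rng)
V = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
B = V @ V.conj().T                        # rank-two positive-semidefinite
lower, middle, upper = trace_bounds(A, B, C)
```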


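Using (8)-(11) and (14), one can verify that

$$\left.\frac{\partial\Psi^{(g)}\left[(1-\epsilon)P_Z + \epsilon\hat{P}_Z\right]}{\partial\epsilon}\right|_{\epsilon=0} = \frac{E\left[g(Z)\left(\left(Z - \mu_Z^{(g)}\right)\left(Z - \mu_Z^{(g)}\right)^T - \Sigma_Z^{(g)}\right); \hat{P}_Z\right]}{E\left[g(Z); P_Z\right]} = \frac{1}{N}\sum_{n=1}^{N} H(Z(n))$$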

$$\mathrm{tr}[ABAC] \le \mathrm{tr}\left[A^{1/2}BA^{1/2}\right]\left\|A^{1/2}CA^{1/2}\right\|_S \qquad (62)$$

is a V-statistic [36] with zero-mean kernel
$H(Z) \triangleq \frac{g(Z)}{E[g(Z); P_Z]}\left(\left(Z - \mu_Z^{(g)}\right)\left(Z - \mu_Z^{(g)}\right)^T - \Sigma_Z^{(g)}\right)$. By
condition (29) and the definitions of $Z$ and $g(\cdot)$ in Lemma 1
in Appendix C, $H(Z)$ must have finite variance entries.

By (65) and (66) we have that


$= \mathrm{tr}[AB]\,\lambda_{\max}(AC)$


Ri 9) = 


E 


1 - 


g( Z);P S 


and 


E [g (Z); P 2 



->(s) 


( 66 ) 





-1 

E 

9 ( z ); Pz 

tr [ABAC] > tr 

a 1 / 2 ba 1 ( 2 

(a^ca 1 / 2 ) 


E 

[ff (Z); P z ] 


(fiP-vP) (a^-/*L 9) ) T . 


$= \mathrm{tr}[AB]\,\lambda_{\min}(AC)$,
where $\|\cdot\|_S$ denotes the spectral norm [43].

I. Proof of Theorem 3
Assume that

*,(s) 


s 

(63) 


Therefore, each entry of satisfies: 


R 


(s) 




< \c\ 

+ \D\ 


fg) 


J k,l 


sl 9) 


7* 


(a) 


,,(g) 


Pz 

l 


J k,l 
a.s., 


(67) 


where 


$$\hat{\Sigma}_Z^{(g)} - \Sigma_Z^{(g)} = O\left(\sqrt{N^{-1}\log\log N}\right) \quad \text{a.s.}, \qquad (64)$$

where $\hat{\Sigma}_Z^{(g)}$ and $\Sigma_Z^{(g)}$ are defined in the real-imaginary
decompositions of $\hat{\Sigma}_X^{(u)}$ and $\Sigma_X^{(u)}$, respectively, in Appendix
C. Then, $\hat{\Sigma}_X^{(u)} - \Sigma_X^{(u)} = O\left(\sqrt{N^{-1}\log\log N}\right)$ a.s., and
therefore, by Lemma 3.2 in [35] we obtain that the
descendingly ordered eigenvalues of $\hat{\Sigma}_X^{(u)}$ and $\Sigma_X^{(u)}$ satisfy
$\hat{\lambda}_k^{(u)} - \lambda_k^{(u)} = O\left(\sqrt{N^{-1}\log\log N}\right)$ a.s. for $k = 1, \ldots, p$.

Using this result and the assumption that $\lambda_k^{(u)}$, $k = 1, \ldots, p$, satisfy
(18), the strong consistency of $\hat{q}$ follows directly from the
proof of Theorem 3.1 in [35]. Therefore, in order to
complete the proof, we show that under the condition (29)
the assumption in (64) is satisfied.

Similarly to (14), the empirical MT-covariance $\hat{\Sigma}_Z^{(g)}$ can be
written as a statistical functional $\Psi^{(g)}[\hat{P}_Z]$ of the empirical

C = 1 - 


E 

ff (Z); P z 

E 

[ff (Z); P z ] 


( 68 ) 


1 N 

= c-^( 5 (Z(n))-E[ 5 (Z);P z ]), 


n—1 


D = 


E 

ff (Z); P z 

E 

[ff (Z); P z ] 


= c 


( 

b E ( Zk w - 


p 


(s) 


J kt 


(69) 


n—1 


E [Zkg (Z); P z 
E [g (Z); P z ] 


9 ( z («)) 


N 


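probability measure $\hat{P}_Z \triangleq \frac{1}{N}\sum_{n=1}^{N}\delta_{Z(n)}$, where
$\Psi^{(g)}[P_Z] = \Sigma_Z^{(g)}$. The Taylor expansion of $\Psi^{(g)}[\hat{P}_Z]$ about
$P_Z$ is given by [36]:

$$\Psi^{(g)}\left[\hat{P}_Z\right] = \Psi^{(g)}\left[P_Z\right] + \left.\frac{\partial\Psi^{(g)}\left[(1-\epsilon)P_Z + \epsilon\hat{P}_Z\right]}{\partial\epsilon}\right|_{\epsilon=0} + R_Z^{(g)}, \qquad (65)$$

where $R_Z^{(g)}$ denotes the remainder term.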


$c \triangleq E^{-1}[g(Z); P_Z]$, and the second equality in (69) stems from
(9) and (11), with $Z_k$ and $Z_k(n)$ denoting the $k$-th entry of
$Z$ and $Z(n)$, respectively.

Under the condition (29), and the definitions of $Z$, $Z(n)$ and
$g(\cdot)$ in Lemmas 1, 2 in Appendix C, the summands in (68)
and (69) have finite variances. Therefore, by the i.i.d.
assumption and the law of the iterated logarithm (LIL) [36] we
have that

$$C = O\left(\sqrt{N^{-1}\log\log N}\right) \quad \text{a.s.} \qquad (70)$$

and

$$D = O\left(\sqrt{N^{-1}\log\log N}\right) \quad \text{a.s.} \qquad (71)$$

Furthermore, under condition (13), which follows from (29),
$\hat{\mu}_Z^{(g)}$ and $\hat{\Sigma}_Z^{(g)}$ are strongly consistent, as shown in Appendix
B. Hence, by (67), (70) and (71), we conclude that the
remainder (66) satisfies $R_Z^{(g)} = o\left(\sqrt{N^{-1}\log\log N}\right)$ a.s.
Thus, by Theorem 6.4.2 in [36] we conclude that (64)
holds.


□ 


J. Proof of Proposition 5 

Similarly to the proof of Theorem 1, it can be shown that
under the array model (1), the coherent signals model (30),
and the compound Gaussian noise assumption (22), the
Gaussian MT-covariance matrix of the array output takes the
form:

$$\Sigma_X^{(u_G)}(\tau) = dd^H\sigma_{\alpha^2 s}^2(\tau) + \sigma_{\alpha W}^2(\tau)I, \qquad (72)$$

where $d \triangleq A\xi$ and $\sigma_{\alpha^2 s}^2(\tau)$ is the variance of $\alpha^2(n)s(n)$,
$\alpha \triangleq \sqrt{\tau^2/(\tau^2+\nu^2)}$, under the transformed joint probability
measure $Q_{\alpha,s}^{(g)}$ with the MT-function
$g(\alpha, s;\tau) \triangleq \left(\frac{\alpha^2}{\pi\tau^2}\right)^p \exp\left(-\alpha^2\|d\|_2^2|s|^2/\tau^2\right)$.
The term $\sigma_{\alpha W}^2(\tau)$ is the variance of $\alpha(n)W(n)$ under the
transformed joint probability measure $Q_{\alpha,W}^{(h)}$ with the
MT-function $h(\alpha;\tau) \triangleq E[g(\alpha, s;\tau); P_s]$.

The Gaussian MT-covariance (72) is structured similarly to
the standard covariance $\Sigma_X$ for coherent signals [37].
Therefore, by the ULA assumption and (31)-(35), the proof
follows using the same arguments as in [37] for the
spatially smoothed version of $\Sigma_X$. □